Posts by Rui Barradas • 15,422 points
432 posts
-
1 vote, 2 answers, 206 views
A: r - average of one variable relative to the values of another variable in a data frame and take NA values
The question is rather confusing. It asks for the means of frequencia grouped by campanha and then only gives code examples where the grouping is by local and especie. I’ll group first by campanha.…
-
3 votes, 1 answer, 446 views
A: R - Create Binary Variable (dummy) value 1 for 50% of the total
I believe that the following code solves the problem in the question. First I define a function that processes columns of class numeric and creates each dummy. It does this by adding the values from…
-
1 vote, 1 answer, 112 views
A: Linear regression loop in R with changes in the y variable
You can do what you want without for loops, with *apply loops. First, I’ll rebuild the dataset, since it’s easier to have all the columns in the same data.frame. base <- X1 base$X2 <- X2$X2…
-
2 votes, 2 answers, 1338 views
A: r - sum of a variable relative to the values of another variable in a data frame
This can be solved with the function aggregate. res <- aggregate(frequencia ~ campanha + especie, dados, sum) res # campanha especie frequencia #1 1 A 11 #2 2 A 19 #3 1 B 11 #4 2 B 13 #5 1 C 16…
-
2 votes, 1 answer, 123 views
A: Is there any way to get the standard deviation as a percentage (R)?
If you want to know the standard deviation relative to the mean, that is the coefficient of variation. To express it as a percentage, just multiply by one hundred. Programming the CV in R is a very easy…
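As a rough illustration of the idea, the coefficient of variation in percent is the standard deviation divided by the mean, times one hundred. The function name cv_pct and the sample vector are hypothetical and not taken from the answer.

```r
# Coefficient of variation in percent: standard deviation relative to the mean.
# cv_pct is an illustrative name, not the answer's own function.
cv_pct <- function(x, na.rm = FALSE) {
  100 * sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up data
cv_pct(x)
```

Note that sd computes the sample standard deviation (denominator n - 1), so this is the sample CV.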
-
3 votes, 2 answers, 634 views
A: Boxplots - rstudio
First we have to know how many boxes there are in this chart. Just copy the boxplot instruction and replace the graphical command with length(list(etc)).…
-
8 votes, 1 answer, 1856 views
A: How to run Dunnett’s test in R?
To run Dunnett’s test, you can use the function glht from package multcomp. As there is no data in the question I will use the iris dataset. Immediately before running the test the random number generator…
-
3 votes, 1 answer, 34 views
A: Operation with 2 columns dynamically
The following code does what the question asks. It uses only base R and I believe it does not depend on the data frame being processed. First a data frame for testing. set.seed(4961) # Makes the results…
-
3 votes, 2 answers, 1414 views
A: In R, how to calculate the average of one column based on another?
There are many ways to do what you want. But first the data. set.seed(941) # Makes the results reproducible Data <- c("3/1/2005", "4/1/2005", "5/1/2005", "6/1/2005", "14/2/2006", "15/2/2006",…
-
4 votes, 1 answer, 611 views
A: R - How to create a lagged variable (lag) conditional on the individual?
There is a simpler way to do what the question asks. Instead of pipes (%>%), use ave. Note: the function lag that will be called is the one from package dplyr. library(dplyr)…
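The answer combines ave with dplyr's lag. The sketch below is a base-R-only variant of the same idea, in which the lag is written by hand as c(NA, head(v, -1)). The data frame and its column names (id, x, x_lag) are made up for illustration.

```r
# Hypothetical data: two individuals (id) observed over time.
dados <- data.frame(id = c(1, 1, 1, 2, 2),
                    x  = c(10, 20, 30, 5, 7))

# ave() applies FUN within each level of id and returns a vector
# aligned with the original rows; the hand-written lag shifts the
# values one position down inside each group.
dados$x_lag <- ave(dados$x, dados$id,
                   FUN = function(v) c(NA, head(v, -1)))
dados
```

The first observation of each individual gets NA, since there is no earlier value within the group.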
-
10 votes, 2 answers, 1974 views
A: Modify gradient colors in graphs in ggplot2
My solution is very similar to that of @Marcus Nunes, but with a difference that seems important to me, so I decided to answer as well. The difference is in the color vector used in geom_label. To have…
-
6 votes, 2 answers, 95 views
A: Generate a date after a specific date
There are several things wrong with your code. First, the following two instructions are equivalent, since "%Y-%m-%d" is the default for the argument format: as.Date(base$DATE_END, "%Y-%m-%d")…
-
1 vote, 2 answers, 3207 views
A: Replace specific column values with NA
In addition to your solution in the comment, which is completely vectorized, there is another vectorized one that I believe is more readable. First an example dataset. set.seed(5139) # Makes the results…
-
4 votes, 3 answers, 11076 views
A: Converting factor to numeric in R
This problem is quite frequent when working with objects of class data.frame, which is the class of objects that the function read.table and its derivatives produce. To avoid it, just note that by default the…
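A short sketch of the underlying pitfall, on a made-up factor: calling as.numeric directly on a factor returns its internal integer codes, not the values shown. Converting through as.character (or indexing the numeric levels) recovers the values.

```r
# Made-up factor whose labels look like numbers.
f <- factor(c("10", "20", "5", "20"))

codes  <- as.numeric(f)                # internal codes of the levels
values <- as.numeric(as.character(f))  # the numbers actually displayed
same   <- as.numeric(levels(f))[f]     # equivalent, converts each level once

codes
values
```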
-
2 votes, 1 answer, 1024 views
A: How to compare two columns of a spreadsheet, and keep the information that is equal?
Maybe this is what you want. Note that the elements common to the two columns are the first elements of the vector Coluna_C, whatever their position in the original vectors, Coluna_A or Coluna_B.…
-
1 vote, 3 answers, 1583 views
A: Function in R that also returns its own execution time
See if the following is what you want. The result of the base R function proc.time is obtained in the first instruction of the function teste and then subtracted from proc.time at the end. teste <-…
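The pattern described can be sketched as follows. The function timed_mean and the work it does are hypothetical; only the use of proc.time at the start and at the end follows the description.

```r
# timed_mean is an illustrative function: it does some work and returns
# the result together with its own execution time.
timed_mean <- function(n) {
  t0 <- proc.time()             # first instruction: record the start time
  result <- mean(rnorm(n))      # the work being timed
  elapsed <- proc.time() - t0   # subtract the start time at the end
  list(result = result, time = elapsed)
}

out <- timed_mean(1e5)
out$time[["elapsed"]]           # wall-clock seconds spent inside the function
```

Returning a list keeps the computed value and the timing together, so the caller can use either one.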
-
2 votes, 2 answers, 154 views
A: Use of if and else
Firstly, as stated in the comments, the code is not reproducible. For two reasons: 1) we have no data to test solutions; 2) in the question it is said that it is a function but we do not know how it…
-
1 vote, 1 answer, 888 views
A: Box Plot with standard deviation
Here’s how you can produce box-and-whisker plots (boxplots) with the mean as the measure of central tendency and the standard deviation as the measure of variability.…
-
4 votes, 2 answers, 101 views
A: How to make a loop/routine for the write.fst() function?
To do what you want, it’s best to use lapply, applying to each element of the vector Dados an anonymous function that reads the file fl. The value of lapply is an object of class list, and each…
-
2 votes, 1 answer, 167 views
A: R - gtrends: ISO language code "en" or "en-BR" do not work?
It seems that some users are having problems and others (like me) are not. This answer is not meant to solve the problem; it only serves to show the output I get. library(gtrendsR) gtrend1 <-…
-
4 votes, 2 answers, 237 views
A: How to create an array with repeating dates?
Although you have answered your own question, I will also answer, for two reasons. In the question’s data each day is repeated 6 times, and with by = 0.5 it is only repeated twice. There are also the missing…
-
3 votes, 3 answers, 743 views
A: Count of TRUE and FALSE
In addition to the simple answers that have already been given, there is a more complicated one that can be useful when we only need to know one of the two counts, either how many TRUE or how many FALSE. For this, you…
-
4 votes, 1 answer, 312 views
A: Plot a time series graph where the x-axis shows every year
To do what you ask, you have to start by not including the axes with the argument axes = "n". Then use the function axis to annotate the axes where and how you want. First I will create a time…
-
5 votes, 2 answers, 221 views
A: Create a vector with the levels of a factor in R
You can simplify the code of @Willianvieira. The following code does not use unique, only levels, to directly extract the factor levels. I’ll use the same example data. x <- as.factor(rep(1:13,…
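A minimal sketch on made-up data (not the answer's own example, which is cut off above): levels returns the distinct levels directly as a character vector, with no need for unique.

```r
# Made-up factor with repeated values.
f <- factor(c("b", "a", "c", "a", "b"))

lv <- levels(f)   # character vector of the distinct levels, in level order
lv
```

By contrast, unique(f) would return a factor, with its values in order of first appearance.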
-
3 votes, 1 answer, 75 views
A: How to transform a written sequence into a numerical sequence? (R)
Transforming any alphanumeric string of the form x non_number y, where x and y are two integers, into the sequence x:y can be done as follows. x <- "32 à 38" y <- unlist(strsplit(x, "[^[:digit:]]+"))…
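One possible way to finish the idea is sketched below. The steps after strsplit are an assumption, since the original code is cut off, and the input string is a made-up ASCII variant of the one in the answer.

```r
x <- "32 to 38"                           # hypothetical input: x non_number y

# Split on any run of non-digit characters, leaving only the two numbers.
y <- unlist(strsplit(x, "[^[:digit:]]+"))
y <- as.integer(y)

s <- y[1]:y[2]                            # the integer sequence x:y
s
```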
-
6 votes, 2 answers, 4592 views
A: How do you convert a number with a decimal comma in R?
I believe that what you want should be solved with sub and not with gsub. x <- c("123,45", "456,78", "0,001") y <- sub(",", ".", x) y [1] "123.45" "456.78" "0.001" as.numeric(y) [1] 123.450…
-
1 vote, 1 answer, 47 views
A: Separate values from a list of summaries in R
I believe this can be done with successive applications of lapply. The function to be applied is the extraction function [[. In the case of RMSEA, in my tests it gave a matrix with 4 rows, therefore I…
-
1 vote, 1 answer, 99 views
A: How to run a loop in R and store the results of a summary in a vector
It should be possible to solve the problem with a function that runs the question’s code, called 20 times with lapply. The results are stored in smry_list. sem_smry <- function(DF, cfa, p){ inx <-…
-
4 votes, 2 answers, 300 views
A: How to create a data frame of a database based on the difference of two dates in a column of another categorical variable in the R software
You can do what you want with the base R function aggregate. Grupo <- c("A", "A", "A", "B", "B", "C", "C") Data <- c("01/02/2017", "15/02/2017", "20/03/2017", "18/02/2017", "01/03/2017",…
-
3 votes, 1 answer, 337 views
A: Random choice of rows in a matrix in R
To select at random p numbers out of m, the easiest way is to use the function sample. set.seed(1234) # Makes the results reproducible m <- 7 mat <- matrix(rnorm(35), nrow = m) p <- 4 inx…
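A sketch of how the selection might be completed. The setup mirrors the preview above, but the lines from inx onward are an assumption, since the original code is cut off, and sub_mat is a hypothetical name.

```r
set.seed(1234)                      # makes the results reproducible
m <- 7
mat <- matrix(rnorm(35), nrow = m)  # 7 x 5 matrix
p <- 4

inx <- sample(m, p)                 # p distinct row numbers out of 1..m
sub_mat <- mat[inx, ]               # the randomly chosen rows
dim(sub_mat)
```

sample draws without replacement by default, so no row is picked twice.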
-
7 votes, 1 answer, 171 views
A: R - How to calculate the price variation for different periods and companies?
The function ave was made to solve these kinds of problems. dados$Variação <- ave(dados$Preço, dados$Empresa, FUN = function(x) c(x[1], diff(x))) Data. dados <- structure(list(Empresa = c(1,…
-
1 vote, 2 answers, 3944 views
A: How to transpose rows into columns in a data frame?
I believe that the simplest way is still with the base R function xtabs. result <- xtabs( ~ ID + mes, dados) head(result) # mes #ID ago set # 1 1 0 # 2 1 0 # 3 1 0 # 4 1 0 # 5 1 0 # 6 1 0…
-
3 votes, 1 answer, 127 views
A: How to calculate monthly Cvar in R?
I believe I have managed to do something similar to what the question asks. If it’s not the following, maybe it can be adapted to your problem. The trick is to divide the data by month, using the base…
-
5 votes, 3 answers, 4047 views
A: R: How to count and sum the amount of a certain "factor" in the observations (rows) of a data.frame?
I think the simplest way is with rowSums. Since the comparisons == result in logical values FALSE/TRUE, which R encodes as 0/1, just add the values in each row. rowSums(Base[, 2:11] == "Sim") #[1]…
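A small self-contained sketch of the same idea. The data frame below is made up (three answer columns instead of the ten in the question), and n_sim is a hypothetical name.

```r
# Made-up data: one row per respondent, answers stored as "Sim"/"Nao".
Base <- data.frame(id = 1:3,
                   q1 = c("Sim", "Nao", "Sim"),
                   q2 = c("Sim", "Sim", "Nao"),
                   q3 = c("Nao", "Nao", "Nao"),
                   stringsAsFactors = FALSE)

# The comparison yields a logical matrix; rowSums counts the TRUEs per row.
n_sim <- rowSums(Base[, 2:4] == "Sim")
n_sim
```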
-
3 votes, 1 answer, 120 views
A: Calculating rates in R
I believe the following does what you want, except for the graphs. First of all, I turned the columns Sexo and Ocupaçao into factors and assigned them descriptive labels, Masculino/Feminino and…
-
1 vote, 2 answers, 60 views
A: Changing variable value
One possibility is to use the function is.na<-. First, read the question data. cv1 <- scan(text = "0 0 108919 4152 317 334403 0 35092 12762 NA") Now, turn the zero values into NA. cv2 <-…
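One way the replacement step could look. The original code is cut off after cv2 <-, so the lines below are an assumption rather than the answer's own code. The values are the ones in the question's data, entered with c() instead of scan.

```r
cv1 <- c(0, 0, 108919, 4152, 317, 334403, 0, 35092, 12762, NA)

cv2 <- cv1
# is.na<- sets the indexed elements to NA; which() drops the NA that
# the comparison cv2 == 0 produces for the last element.
is.na(cv2) <- which(cv2 == 0)
cv2
```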
-
5 votes, 2 answers, 1149 views
A: How to delete rows from a Data Frame in R based on the values of one of the columns?
To search for alphanumeric patterns, it is best to use grep or grepl. set.seed(6323) # Makes the results reproducible n <- 100 DF <- data.frame(A = sample(LETTERS, n, TRUE), X =…
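A minimal sketch of dropping rows by pattern with grepl. The data frame and the pattern "^AB" are made up for illustration and are not the answer's own.

```r
# Made-up data frame with a character column to match against.
DF <- data.frame(A = c("AB1", "CD2", "AB3", "EF4"),
                 X = 1:4,
                 stringsAsFactors = FALSE)

# grepl returns TRUE for rows whose A matches the pattern;
# negating it keeps only the rows that do not match.
DF2 <- DF[!grepl("^AB", DF$A), ]
DF2
```

grepl is convenient here because it returns one logical value per row, which can be negated and used directly as a row index.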
-
3 votes, 1 answer, 92 views
A: Question about bar charts in ggplot
I believe that to create this graph it is necessary to change the data.frame from wide to long. See Reshaping data.frame from wide to long format. library(ggplot2) dados$dia <- as.Date(dados$dia,…
-
4 votes, 1 answer, 79 views
A: Correlation between two dose-response R curves?
I don’t know if this is what you want, but "the correlation between two curves" can be given with the following code. First we get the points of the curves with predict (actually the method for…
-
6 votes, 2 answers, 1901 views
A: In R, How to calculate the average of a column based on criterion in another column?
It is a problem of selecting rows from a data frame by a logical condition: set.seed(6480) # To have reproducible results n <- 50 dados <- data.frame(A = runif(n, 0, 100), B = runif(n,…
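A small sketch of selecting by a logical condition and then averaging. The data and the threshold 50 are made up for illustration.

```r
# Made-up data: average column B over the rows where A exceeds 50.
dados <- data.frame(A = c(10, 60, 80, 30),
                    B = c(1, 2, 3, 4))

keep <- dados$A > 50          # logical index, one value per row
media <- mean(dados$B[keep])  # mean of B over the selected rows
media
```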
-
5 votes, 1 answer, 98 views
A: R - How to sample pairs from an array without repeating values?
It’s easier to do that than you think. More precisely, you don’t have to use replicate. It suffices to note that sample(vetor) produces a permutation of its argument. (No repetitions.)…
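A sketch of how a single permutation can be cut into pairs. The layout as a two-column matrix is an assumption, as are the names perm and pares, since the rest of the answer is not shown.

```r
set.seed(1)                     # makes the results reproducible
vetor <- 1:10

perm <- sample(vetor)           # random permutation, each value exactly once
pares <- matrix(perm, ncol = 2, byrow = TRUE)  # each row is one pair
pares
```

Because every value appears exactly once in the permutation, no value can occur in more than one pair.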
-
1 vote, 1 answer, 45 views
A: Unique numbering in data frame
To solve the problem I will use the strategy of split-apply-combine several times, with a single instruction at a time. First, we create the column Contagem with the function ave. dados$Contagem…
-
3 votes, 1 answer, 3255 views
A: Calculate difference between dates in days in R
First of all, I’m not sure what the names of your data are, so I’ll call the data frame time1 and your column entrada. If this isn’t right, just change the names; the code is still good.…
-
7 votes, 2 answers, 865 views
A: Download CSV file using R
This is a standard CSV file but some columns need further processing. Try the following. str2num <- function(x){ x <- gsub(",", "", x) as.numeric(x) } URI <-…
-
5 votes, 1 answer, 214 views
A: Similarity of Texts
I believe the following code answers the question. First I’ll read the data, since we don’t have access to the file exemploTeste.csv. Nome <- scan(what = character(), text = " 'heber dos Santos…
-
3 votes, 2 answers, 133 views
A: Indicator on R with more than one condition with duplicate values
If the problem description is correct and the expected result example is not, the following code solves the question. i1 <- grepl("Cooperativa|Banco", dados$IF, ignore.case = TRUE) i2 <-…
-
3 votes, 3 answers, 152 views
A: Aggregate string in R
I believe that this code answers the question, but we need to pay attention to the following: in the desired result, which is in the question, the row ALFREDO has column b equal to SICRED when…
-
3 votes, 2 answers, 331 views
A: Filter Different texts at different positions in R
Try the following. First we use gsub to obtain only the numbers in dados$NOME. Then we filter with a logical index. num <- as.numeric(gsub("[^[:digit:]]", "", dados$NOME)) dados2 <-…
-
5 votes, 2 answers, 899 views
A: Renaming the levels of a factor based on a data frame
I believe the following code settles the question. However, I had some problems with the columns involved because they are of class factor. First, include the argument stringsAsFactors in the…
-
3 votes, 1 answer, 373 views
A: Convert monetary values "R$" to double type
Try applying the function converte to each of the columns with the problematic values. converte <- function(x) as.numeric(sub(",", "\\.", gsub("R\\$|\\.", "", x))) x <- "R$ 5.500,00" converte(x)…