Posts by Rui Barradas • 15,422 points
432 posts
-
0 votes
1 answer
209 views
A: How to winsorize a data set by group
I’ll use the function winsor from package psych to winsorize the data. To answer the question, just use one of the ways R has to apply functions by group. Base R. The base R function ave was made…
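A minimal base R sketch of the by-group idea, assuming that winsorizing means clamping each group's values to its own quantiles. The helper winsorize_q, the trim level and the data are hypothetical; this is not the psych::winsor implementation.

```r
# Hypothetical helper: clamp x to its [trim, 1 - trim] quantiles.
winsorize_q <- function(x, trim = 0.1) {
  lims <- unname(quantile(x, probs = c(trim, 1 - trim), na.rm = TRUE))
  pmin(pmax(x, lims[1]), lims[2])
}

# Made-up data with one outlier per group.
dados <- data.frame(
  grupo = rep(c("a", "b"), each = 10),
  valor = c(1:9, 100, 11:19, 500)
)

# ave applies the function within each level of grupo and returns
# a vector aligned with the original rows.
dados$valor_w <- ave(dados$valor, dados$grupo, FUN = winsorize_q)
```

Because ave keeps the original row order, the result can be assigned directly to a new column.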
Tag: r
answered Rui Barradas 15,422
-
4 votes
2 answers
88 views
A: Create a column with the second highest row value
Here are two functions that compute the second largest element in each row of a table. The second function uses the knowledge of which value is the maximum in each row to compute the 2nd largest.…
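A short sketch of one simple way to get the second largest value per row, by sorting each row. The function name second_largest and the matrix are hypothetical and are not the two functions of the answer. Note the assumption about ties: if the maximum appears twice in a row, it is also returned as the second largest.

```r
# Hypothetical helper: sort in decreasing order and take the 2nd element.
second_largest <- function(x) sort(x, decreasing = TRUE)[2]

m <- matrix(c(3, 9, 1,
              7, 2, 8,
              5, 4, 6), nrow = 3, byrow = TRUE)

# MARGIN = 1 applies the function to each row.
segundo <- apply(m, 1, second_largest)
segundo
```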
Tag: r
answered Rui Barradas 15,422
-
0 votes
2 answers
88 views
A: Select first with conditional
With base R it can be done as follows. First convert the column data to class "Date". x$data <- as.Date(x$data) Now, first get an index of the table rows ordered by date and then extract the…
-
3 votes
2 answers
220 views
A: Interaction graph using ggplot2
As I said in a comment to the question, the mistake is that G is not a variable of the data frame interaction, so colour = G cannot work. Just switch to select(Peso, Tempo, S, G) and group_by(S, G,…
-
3 votes
2 answers
233 views
A: How do I plot the Cobb-Douglas production function in R?
This answer needs the package plot3D. If it is not already installed, you can install it with install.packages("plot3D") First I define a fairly general function cobbDouglas, which also accepts as…
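For reference, the Cobb-Douglas production function is Y = A * K^alpha * L^beta. The sketch below only evaluates it on a grid, which is the input a surface plot needs. The function name cobb_douglas, its defaults and the grid are assumptions, not the cobbDouglas function of the answer.

```r
# Hypothetical, simplified version of the production function.
cobb_douglas <- function(K, L, A = 1, alpha = 0.5, beta = 0.5) {
  A * K^alpha * L^beta
}

K <- seq(1, 10, length.out = 25)  # capital
L <- seq(1, 10, length.out = 25)  # labour

# outer evaluates the function on every (K, L) pair.
Y <- outer(K, L, cobb_douglas)

# A quick base R look at the surface, as an alternative to plot3D:
# persp(K, L, Y, theta = 30, phi = 20)
```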
Tag: r
answered Rui Barradas 15,422
-
2 votes
3 answers
121 views
A: How to exclude a specific row from a data set in R
With subset, as in the question, it would be subset(base, !fraude %in% c("N", "P", "K")) It is more efficient to use a logical index. i <- !base$fraude %in% c("N", "P", "K")…
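A self-contained sketch of both approaches, so they can be compared. The data frame base here is made up for illustration; only the column name fraude and the excluded codes come from the excerpt.

```r
# Made-up data; "S" stands for any value that should be kept.
base <- data.frame(
  id = 1:6,
  fraude = c("N", "S", "P", "S", "K", "S")
)

# With subset.
res_subset <- subset(base, !fraude %in% c("N", "P", "K"))

# With a logical index: TRUE for the rows to keep.
i <- !base$fraude %in% c("N", "P", "K")
res_index <- base[i, ]
```

Both results keep the same rows; the logical index i can also be reused to filter other objects of the same length.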
Tag: r
answered Rui Barradas 15,422
-
3 votes
1 answer
69 views
A: Insert names next to the bubbles of a bubble chart in R and exaggerate the difference between them
Just include the argument size in geom_label_repel to get what you want. First I’ll recreate the data with data.frame, the most natural way to create it. nome <- c("a","b","c","d") anos <-…
Tag: r
answered Rui Barradas 15,422
-
3 votes
2 answers
24 views
A: Include "." or "," from right to left in integers
The functions formatC and sprintf can be useful for this problem. formatC(as.numeric(x)/100, digits = 2, decimal.mark = ".", format = "f") #[1] "17818.18" "1781818.00" "925617818.18" #[4] "17818.10"…
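A small sketch of the sprintf route and of formatC with a thousands separator. The input vector x is an assumption (digit strings whose last two digits are the decimals), and big.mark is an extra not shown in the excerpt.

```r
# Assumed input: integers stored as strings, last two digits are decimals.
x <- c("1781818", "178181800")
valor <- as.numeric(x) / 100

# sprintf with a fixed number of decimals.
s <- sprintf("%.2f", valor)

# formatC can also insert a thousands separator.
f <- formatC(valor, format = "f", digits = 2, big.mark = ",")
```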
Tag: r
answered Rui Barradas 15,422
-
3 votes
1 answer
1547 views
A: How to change the date format from "year/month/day" to "year/month" in R?
Here are two different ways to convert a date to "year/month". Base R. Use the method format.Date for objects of class "Date". format(Sys.Date(), "%Y-%m") #[1] "2019-04" Package zoo. The function…
-
3 votes
1 answer
826 views
A: How to filter data in a data.frame by a given period of time in R?
The following function filters the data by date, passed as the argument mes. The date format is assumed to be year-month-day, so a full date must be passed, but it can be any other…
-
2 votes
1 answer
58 views
A: Merge data frames by rows
In base R, one can do this with merge. df3 <- merge(df1, df2, by = c('UF', 'Ano')) df3[[3]] <- df3[[3]] - df3[[4]] df3 <- df3[-4] df3 # UF Ano Valor.x #1 AC 2007 1869351 #2 AC 2008 2008371…
Tag: r
answered Rui Barradas 15,422
-
2 votes
2 answers
94 views
A: Select time in R
With base R only. DADOS$DATA_INICIO <- as.POSIXct(DADOS$DATA_INICIO, format = "%d/%m/%Y %H:%M") DADOS$HORA <- format(DADOS$DATA_INICIO, "%H") DADOS # MATRICULA DATA_INICIO HORA #1 111…
Tag: r
answered Rui Barradas 15,422
-
4 votes
1 answer
196 views
A: Coloring specific points in ggplot - R
You can do it with one more geom_ layer, a geom_point. However, to have only the points for Brazil, you have to select a subset of the data, in this case with subset. library(tidyverse)…
-
2 votes
1 answer
216 views
A: Inserting intervals with the if else condition structure
Neither if nor for loops nor *apply are necessary. Here are two ways to do what you want. 1. One can use cut. my_fun2 <- function(x){ menor_que_5 <- 5 - .Machine$double.eps^0.5 brks <-…
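A minimal sketch of how cut assigns values to intervals without any if or loop. The breaks and labels are hypothetical, and instead of the small offset used in the excerpt, this version relies on right = FALSE so that each interval is closed on the left.

```r
x <- c(1, 5, 12)

# Intervals [0, 5), [5, 10), [10, Inf) because right = FALSE.
classe <- cut(
  x,
  breaks = c(0, 5, 10, Inf),
  labels = c("low", "mid", "high"),
  right = FALSE
)
classe
```

The result is a factor; as.character(classe) gives plain labels if needed.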
Tag: r
answered Rui Barradas 15,422
-
4 votes
1 answer
250 views
A: Rotate names on x-axis
The trick is not to plot the axis in question, in this case the x axis, with xaxt = "n", and then use the return value of barplot to annotate the axis with text. Note that the argument par('usr')[3] is…
Tag: r
answered Rui Barradas 15,422
-
4 votes
1 answer
86 views
A: How to make the output of the kable() command appear in the RStudio Viewer?
The function view_kable below is inspired by this answer on the English-language SO. view_kable <- function(x, format = "latex", ...){ tab <- if(format == "latex")…
-
4 votes
2 answers
5826 views
A: Use of seed in R
The function set.seed is used to reproduce the results of pseudo-random number generators (RNG). This is important for reproducible results in data analyses that use RNGs. For example, when…
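A short illustration of what reproducing results means in practice. The seed value 123 is arbitrary.

```r
set.seed(123)
a <- rnorm(3)

# Resetting the same seed restarts the generator at the same state,
# so the same numbers are drawn again.
set.seed(123)
b <- rnorm(3)

# Without resetting, the generator has advanced and the draws differ.
d <- rnorm(3)

identical(a, b)
identical(a, d)
```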
Tag: r
answered Rui Barradas 15,422
-
3 votes
2 answers
186 views
A: Calculating columns with a conditional in R
A vectorized way is the following. It uses logical indices to modify the column Valor. df$Valor[df$DebCred == 'C'] <- -1*df$Valor[df$DebCred == 'C'] df$Valor[df$DebCred == 'D'] <-…
Tag: r
answered Rui Barradas 15,422
-
0 votes
2 answers
74 views
A: Renaming string text in a column
I believe the code below does what you ask. Note that the data from dput do not have the same structure as the tables in the question. I will use the dput data and create a table as the…
Tag: r
answered Rui Barradas 15,422
-
2 votes
1 answer
42 views
A: Problem with "ifelse" - Object cannot be coerced to type 'double'
The object a is of class "list" because it is the result of the lapply in the previous instruction. So just apply lapply once more. a_if <- lapply(a, function(.a) ifelse(.a <= -2, "Z", "N")) If you…
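A self-contained version of the same idea. The list a below is made up, since the question's data is not shown here; only the lapply/ifelse line comes from the excerpt.

```r
# Made-up list of numeric vectors standing in for the result of lapply.
a <- list(c(-3, 0), c(-2, 5))

# ifelse is vectorized over each vector; lapply walks over the list.
a_if <- lapply(a, function(.a) ifelse(.a <= -2, "Z", "N"))
a_if
```

Calling ifelse(a <= -2, "Z", "N") directly on the list is what raises the coercion error, because the comparison is not defined for a list of vectors.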
Tag: r
answered Rui Barradas 15,422
-
3 votes
2 answers
117 views
A: I could not resolve the error: The condition has length > 1 and only the first element will be used
A fully vectorized way, which in R is always a good idea, is the following. I start with a line of code borrowed from the answer by Willian Vieira, to create the table Dados3. Dados3 <- Dados2 <-…
-
1 vote
1 answer
46 views
A: R loop by script condition
This is far from a complete answer, but if you want to keep only the months with 4 or more rows, you can start with this code. library(tidyverse) temp.group <- temp %>%…
-
1 vote
1 answer
71 views
A: How to make a program in R that provides the result below?
The equation in the question can be programmed as follows. media_amos <- function(z, na.rm = TRUE){ if(na.rm) z[is.na(z)] <- 0 n <- nrow(z) cmb <- choose(n, 2) z[row(z) == col(z)] <- 0…
Tag: r
answered Rui Barradas 15,422
-
3 votes
2 answers
52 views
A: R regression on data in the same row
It’s not as hard as that. In the solution below I start by computing the logarithm of each column except the first. Then I apply (apply) the model lm(x ~ y) to every row. And then there are…
-
2 votes
1 answer
40 views
A: Accents in str_extract()
Regular expressions are not the same everywhere; they depend on the language or country, that is, the locale of the system. From the help page of regex, linked above: The only portable way to specify all…
-
3 votes
1 answer
243 views
A: How to use filter_functions?
First of all I will recreate the data with set.seed to make the results reproducible and with the argument stringsAsFactors = FALSE, to answer the last question. set.seed(1234) data_1 <- data.frame( a =…
-
3 votes
2 answers
176 views
A: IRT (TRI) - using mirt
The maximum number of iterations is documented in the package help page for the function mirt. From help("mirt"): technical: a list containing lower level technical parameters for estimation. May be: NCYCLES…
Tag: r
answered Rui Barradas 15,422
-
2 votes
2 answers
840 views
A: Group data by a certain column in R
In base R you can use aggregate. Note that I called the data dados. aggregate(country ~ continent, dados, function(x) length(unique(x))) # continent country #1 Africa 2 #2 Americas 1 #3 Asia 1…
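A runnable sketch of the same aggregate call on a tiny made-up data frame; only the column names continent and country and the call itself come from the excerpt.

```r
# Made-up data with a repeated country inside one continent.
dados <- data.frame(
  continent = c("Africa", "Africa", "Africa", "Asia"),
  country = c("Angola", "Angola", "Mozambique", "Japan")
)

# Number of distinct countries per continent.
res <- aggregate(country ~ continent, dados, function(x) length(unique(x)))
res
```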
Tag: r
answered Rui Barradas 15,422
-
3 votes
2 answers
71 views
A: Indexing a data frame
You can do what you want with the base R function aggregate. fl <- list.files(pattern = "^cursos-prouni.*\\.csv$") Prouni <- read.csv(fl) str(Prouni) fmla <- mensalidade ~ uf_busca +…
Tag: r
answered Rui Barradas 15,422
-
4 votes
2 answers
124 views
A: R - problems converting txt to read in R
The problem is that the first element of the column V1 is an alphabetic value ("X"). So R reads the whole column as: class "factor" if you have stringsAsFactors = TRUE or specify nothing, since…
Tag: r
answered Rui Barradas 15,422
-
3 votes
1 answer
643 views
A: Convert string to float in R
I think you’re confusing the number with the output of the print method for objects of class "numeric", which is print.default. The number you have is 3.00. Otherwise see first how it is…
Tag: r
answered Rui Barradas 15,422
-
4 votes
2 answers
492 views
A: Barplot with labels
To get what you want, the best is to use the return value of the function barplot as the x coordinates of the text. First the data. scan reads what is in the question. And seq_along avoids such manual work…
Tag: r
answered Rui Barradas 15,422
-
2 votes
1 answer
66 views
A: Same data set, two lines, two equations, on the same graph?
To draw two regression lines, you first have to create an extra variable that tells which group each data point belongs to: x < 700 or x >= 700. library(ggplot2) df$group <-…
-
3 votes
2 answers
308 views
A: Replace numeric values of one vector with another value in a data frame
The following function does what the question describes. fun_replace <- function(x, vetor, novo = 48){ res <- lapply(x, function(y){ i <- y %in% vetor y[i] <- novo y }) res <-…
Tag: r
answered Rui Barradas 15,422
-
4 votes
1 answer
194 views
A: For loop in R - linear regression
Running multiple regressions is not as difficult as that. The biggest problem for those who are starting to learn R is the *apply functions, which are for loops in disguise. They greatly simplify…
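A sketch of the idea with lapply, using the built-in mtcars data as a stand-in for the question's data. The choice of response variables and of wt as the regressor is an assumption for illustration.

```r
# Fit one simple regression per response variable, all against wt.
respostas <- c("mpg", "hp", "qsec")

modelos <- lapply(respostas, function(v) {
  # reformulate builds the formula v ~ wt from character strings.
  lm(reformulate("wt", response = v), data = mtcars)
})
names(modelos) <- respostas

# Extract the slope of wt from each fitted model.
declives <- sapply(modelos, function(m) coef(m)[["wt"]])
declives
```

The list modelos plays the role that a for loop filling a pre-allocated list would play, but without the bookkeeping.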
Tag: r
answered Rui Barradas 15,422
-
9 votes
7 answers
1523 views
A: How to calculate the median of a row in a data.frame in R?
A solution may be the following. library(dplyr) DADOS %>% rowwise() %>% mutate(Soma = (A + B + C + D + E), Média = Soma/5, Mediana = median(c(A, B, C, D, E))) #Source: local data frame [4 x 9]…
-
6 votes
2 answers
73 views
A: Estimating a variable that is hard to isolate
Using the formula just before the final formula in the answer by Marcelo Shiniti Uchimura, we can fit a linear model. #ln(ln(razao2)) = ln(ln(razao1)) + ln(idade2/idade1)*k log.log.razao1 <-…
Tag: r
answered Rui Barradas 15,422
-
3 votes
3 answers
673 views
A: How to go through the data.frame cases using `dplyr`?
Here are two ways to do what you ask, one with base R and the other with the package dplyr. First I’m going to recreate the data, with set.seed to make the results reproducible. And in an easier and more…
Tag: r
answered Rui Barradas 15,422
-
6 votes
1 answer
203 views
A: Find a variable value for a target function
To calculate the values of c that solve the equations cvcalculado(c) == cv, for each value of cv, I will first define another function, the auxiliary function f. This serves to transform…
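A sketch of the auxiliary-function idea, assuming the equations are solved numerically with uniroot (the excerpt is cut off before it says which solver is used). The function g is a hypothetical stand-in for cvcalculado, and the search interval is also an assumption.

```r
# Hypothetical stand-in for cvcalculado.
g <- function(c_val) c_val^2

# Auxiliary function: its root in c_val is where g(c_val) == cv.
f <- function(c_val, cv) g(c_val) - cv

cv <- c(2, 9)

# Solve one equation per target value; cv = v is passed through to f.
sol <- sapply(cv, function(v) {
  uniroot(f, interval = c(0, 10), cv = v, tol = 1e-9)$root
})
sol
```

uniroot needs the function to change sign over the interval, so the interval has to be chosen with the actual function in mind.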
Tag: r
answered Rui Barradas 15,422
-
7 votes
4 answers
315 views
A: How to apply several functions to the same object?
The base R function Map can do what you want. First I will recreate the data, since I will also use a list of vectors, not just a list of functions. set.seed(123) x <- rnorm(10) y <- x is.na(y)…
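A minimal sketch of Map applied to a list of functions. The names in funs and the vector x are made up; the answer's own data and function list are not visible in the excerpt.

```r
# A named list of functions and a single vector to apply them to.
funs <- list(media = mean, mediana = median, maximo = max)
x <- c(1, 2, 6)

# Map walks over funs; list(x) has length 1 and is recycled,
# so every function receives the same vector.
res <- Map(function(f, v) f(v), funs, list(x))
res
```

With a list of several vectors in place of list(x), Map pairs the first function with the first vector, the second with the second, and so on.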
-
5 votes
2 answers
130 views
A: R - cut digits
Although there is already an accepted answer, here is another way to do the same. You can use arithmetic to keep only the last two digits: just compute the remainder of the division by the power of the…
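A small sketch of the arithmetic, assuming the power in question is a power of 10 (the excerpt is cut off at that point). The input vector is made up.

```r
x <- c(12345, 907, 42)

# The remainder of the division by 10^2 keeps the last two digits.
ultimos2 <- x %% 10^2
ultimos2

# As numbers, a leading zero is lost (907 gives 7); format if needed.
ultimos2_txt <- sprintf("%02.0f", ultimos2)
```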
Tag: r
answered Rui Barradas 15,422
-
3 votes
1 answer
92 views
A: Occurrence count of a data frame giving error
This solution uses the package dplyr. library(dplyr) dados %>% group_by(DT, SO) %>% summarise(count = n()) %>% arrange(desc(count)) %>% slice(1:15) ## A tibble: 8 x 3 ## Groups: DT [3] #…
Tag: r
answered Rui Barradas 15,422
-
2 votes
2 answers
49 views
A: Retrieve results in R
Although the calculations in the answer by Tomás Barcellos are right, the object sumario already has the p-values: reg <- lm(mpg ~ wt, mtcars) sumario <- summary(reg) sumario$coefficients[, 4] #…
-
4 votes
4 answers
4136 views
A: Count equal values in one data frame and store in another in R
Here are two ways using only base R. as.data.frame(table(total_amostral$TOTAL)) # Var1 Freq #1 10 1 #2 11 1 #3 12 2 #4 13 3 #5 14 5 #6 15 5 #7 16 7 #8 17 7 #9 18 8 #10 19 7 #11 20 7 #12 21 5 #13 22…
Tag: r
answered Rui Barradas 15,422
-
4 votes
1 answer
77 views
A: How to change a value in a list of R files?
I have written two files with the file names from the question. The first one has # '001.R' if(M){ print("Script 1") } and the file '002.R' is similar. Note that the test M == TRUE is not necessary…
-
4 votes
4 answers
145 views
A: How to count strings from a variable
I will use the data as it is in the answer by Tomás Barcellos. A single line of base R code solves the problem. lengths(strsplit(dado[["DS_COMPOSICAO_DA_COLIGACAO"]], "/")) #[1] 4 3 6 4 Now just create…
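A self-contained sketch of the same one-liner on a made-up character vector, since the data from the other answer is not reproduced here.

```r
# Made-up strings with "/" as the separator.
coligacao <- c("A / B / C", "D / E", "F")

# strsplit returns a list with one character vector per string;
# lengths counts the elements of each of those vectors.
n_partes <- lengths(strsplit(coligacao, "/"))
n_partes
```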
Tag: r
answered Rui Barradas 15,422
-
4 votes
2 answers
49 views
A: Extract vectors from a vector of vector names and merge into a single vector
What you want can be done with a single base R instruction. The main function to use is mget. Then just turn the result into a vector (unlist) without names (unname). vetor_geral <- unname(unlist(mget(teste)))…
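A runnable sketch of the same instruction. The vectors v1 and v2 are made up; only the names teste and vetor_geral and the mget line come from the excerpt.

```r
# Two made-up vectors and a character vector holding their names.
v1 <- 1:2
v2 <- 3:4
teste <- c("v1", "v2")

# mget looks the names up and returns a named list of the objects;
# unlist flattens it and unname drops the generated names.
vetor_geral <- unname(unlist(mget(teste)))
vetor_geral
```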
-
4 votes
1 answer
50 views
A: How to make a prediction interval for a restricted group?
To predict with a model fitted with lm, you need a data frame with the regressors at the points you want. The code below creates a sub-df with the rows where insulin is in the 1st…
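A sketch of the same workflow on the built-in mtcars data, used here as a stand-in for the question's data. Restricting to the first quartile of wt is an assumption made only to mirror the idea of a restricted group.

```r
# Fit the model on the full data.
fit <- lm(mpg ~ wt, data = mtcars)

# Sub data frame: rows with wt in its first quartile, regressor only.
novo <- subset(mtcars, wt <= quantile(wt, 0.25), select = wt)

# Prediction intervals at those points.
pred <- predict(fit, newdata = novo, interval = "prediction", level = 0.95)
head(pred)
```

The result is a matrix with the columns fit, lwr and upr, one row per row of novo.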
Tag: r
answered Rui Barradas 15,422
-
5 votes
3 answers
488 views
A: Calculation of difference between dates
Base R has functions to do calculations with dates; for simple cases such as a difference in days (or other units) there is no need to load external packages. DADOS$DIFERENÇA <- with(DADOS,…
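A minimal sketch with difftime, assuming the two columns are already of class "Date". The column names INICIO, FIM and DIAS and the dates are made up, since the excerpt is cut off before the actual call.

```r
# Made-up start and end dates.
DADOS <- data.frame(
  INICIO = as.Date(c("2019-04-01", "2019-01-15")),
  FIM = as.Date(c("2019-04-10", "2019-03-01"))
)

# difftime returns a "difftime" object; as.numeric keeps just the number.
DADOS$DIAS <- with(DADOS, as.numeric(difftime(FIM, INICIO, units = "days")))
DADOS
```

Other values of units, such as "weeks" or "hours", work the same way.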
Tag: r
answered Rui Barradas 15,422
-
6 votes
3 answers
158 views
A: Earliest date in a data set
Two ways with base R. With aggregate. aggregate(abv_data ~ MATRICULA, DADOS, min) # MATRICULA abv_data #1 1 2017-12-10 #2 3 2015-01-01 #3 4 2016-07-02 #4 5 2016-12-03 #5 6 2014-04-13 With tapply.…
Tag: r
answered Rui Barradas 15,422