Posts by Rui Barradas • 15,422 points
432 posts
-
0 votes • 1 answer • 36 views
A: How to calculate the Mcloone index?
For starters, the column VAA is in quotes and is therefore read as "character"; it must first be converted to "numeric". uniao$VAA <- as.numeric(sub(",", ".", uniao$VAA)) To sum the column VAA by UF.x…
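The conversion step can be tried on a toy table. This is only a sketch: the values below are made up, and only the column names VAA and UF.x come from the answer.

```r
# Hypothetical data: VAA was read as text because it uses a decimal comma
uniao <- data.frame(UF.x = c("SP", "SP", "RJ"),
                    VAA  = c("1,5", "2,25", "10"),
                    stringsAsFactors = FALSE)

# Replace the first comma with a dot, then convert to numeric
uniao$VAA <- as.numeric(sub(",", ".", uniao$VAA))

str(uniao$VAA)
```

sub replaces only the first comma in each string, which is enough when the comma is the decimal mark and there are no thousands separators.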
-
2 votes • 1 answer • 62 views
A: How to adjust the regression line so that 90% of the lines are below the line?
Here are two ways to solve the problem using the package quantreg. The formula y = a*x^b can be transformed by applying logarithms and fitting the resulting model, i.e., a robust regression line…
-
0 votes • 1 answer • 60 views
A: How to create numerical samples based on multiple conditions on multiple vectors?
I don’t know if I understand the question, but if you want n = 50 random numbers as described after the last edit of the question, perhaps the following code solves the problem. Calculate the minimum…
-
1 vote • 1 answer • 42 views
A: Mean by line range
The following function calculates two kinds of averages. With tipo = "segmentos", averages are taken over consecutive segments of m elements: in the case of the question m = 5, that is, the averages of elements 1-5 are calculated, then 6-10, 11-15, etc. tipo = "movel". The…
-
1 vote • 1 answer • 155 views
A: Standardize bar width between distinct graphs ggplot2
Note: The following solution may not be what is required. To automatically ensure that the bars are of the same width, I will join the two datasets creating a new column, Data, which says which data…
-
2 votes • 1 answer • 80 views
A: Join columns in ggplot histogram
Here are two ways to do what the question asks. The problem is in grouping the dates by semester. R has the class "Date" but there is no class "Semester" so you have to do it manually. The package…
-
2 votes • 1 answer • 597 views
A: Turn a Row into a Column in R
There are several ways to reformat data from long to wide format. I will use the package tidyverse. library(tidyverse) df1 %>% group_by(Nome) %>% mutate(Grp = row_number()) %>%…
-
4 votes • 1 answer • 219 views
A: How to read the School Census data in R (Enrollment)
To read the data and filter them at the same time I will use the package sqldf, which I think is ideal for this since it allows you to filter the data with SQL SELECT statements. This…
-
0 votes • 1 answer • 49 views
A: Update worksheet by another worksheet
The following code should solve the problem in the question. For each row of A, it determines the row of B that has the same Empreendimento. If one and only one is found, it updates the columns in question. This…
-
3 votes • 1 answer • 43 views
A: Delete columns that have NA
The following code removes all columns with some NA in them. dados <- dados[!sapply(dados, anyNA)] head(dados) # Nome Nota2 Nota8 #1 4 4 1 #2 4 3 1 #3 2 4 1 #4 2 4 4 #5 1 4 2 #6 4 3 2 Code to…
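The indexing trick can be checked on a small table. A minimal sketch, assuming made-up values; only the idiom dados[!sapply(dados, anyNA)] comes from the answer.

```r
# Hypothetical data: Nota1 contains a missing value, the other columns do not
dados <- data.frame(Nome  = c(4, 4, 2),
                    Nota1 = c(NA, 3, 1),
                    Nota2 = c(4, 3, 4))

# sapply() returns one TRUE/FALSE per column; negate it to keep complete columns
dados <- dados[!sapply(dados, anyNA)]

names(dados)
```

Indexing a data frame with a single logical vector selects columns, so no comma is needed inside the brackets.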
-
2 votes • 1 answer • 422 views
A: Group and sum columns - r
The code in the question is almost there; just include the count n(): library(dplyr) a <- df %>% group_by(P) %>% summarise(Total = sum(Citacoes), Count = n()) a # A tibble: 3 x 3 # P Total Count #…
-
1 vote • 2 answers • 175 views
A: Importing and cleaning several text files in R
If I understood the question (which I doubt), maybe the following code can solve the problem. The function below uses the Unix/Linux command awk to remove duplicate lines of text. Clean files are…
-
2 votes • 2 answers • 87 views
A: Create a matrix without repeating the values of the data argument
In addition to @neves' very thorough answer, I wanted to draw attention to something that is in the question (my emphasis): I know you can do this by creating a matrix with data = NA and then loop…
-
3 votes • 1 answer • 167 views
A: How to partially disregard NA in R operations with a historical data series?
This answer deals with cases where at least one element of the vector whose mean is to be calculated is NA, using na.rm = TRUE. But contrary to what is written in the question, when all the elements are NA…
-
1 vote • 4 answers • 264 views
A: R - How to replace "." (dot) with " " (space) in the column names of a Data.Table?
Just use gsub with the following regular expression: "\\.". Since the dot "." is a metacharacter, you have to escape it with two backslashes. nomes <- c("Salário.Janeiro",…
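The escaping rule can be seen in a short sketch. The names in the vector are made up (plain ASCII stand-ins for the ones in the answer); fixed = TRUE is shown as an alternative that avoids the regular expression altogether.

```r
# Hypothetical column names containing dots
nomes <- c("Salario.Janeiro", "Total.Geral")

# "\\." matches a literal dot; an unescaped "." would match every character
com_espaco <- gsub("\\.", " ", nomes)

# Same result, treating the pattern as a fixed string instead of a regex
com_espaco_fixed <- gsub(".", " ", nomes, fixed = TRUE)

com_espaco
```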
-
1 vote • 2 answers • 304 views
A: How to filter data from R lines?
This answer uses the package dplyr to filter rows by date. But dates must be objects of class "Date", and for this purpose a temporary column is created first, temp_date. library(dplyr) inicio <-…
-
1 vote • 2 answers • 70 views
A: Loop for inside a list with a group argument function
If I understand the question, split is not necessary, just pass the factor y at each test. t_tests <- lapply(names(df_1)[1:3], function(nms){ pairwise.t.test(df_1[[nms]], df_1[['y']],…
-
2 votes • 3 answers • 63 views
A: Date printed in wrong format
You can do everything in one line of code. And much more readable. There is a method format.Date for objects of class "Date", just use it. It also has the advantage of being able to use any date…
-
2 votes • 2 answers • 507 views
A: Leading zeros in R
You can do this with sprintf: id <- 1:20 sprintf("%02d", id) # [1] "01" "02" "03" "04" "05" "06" "07" "08" "09" "10" "11" "12" #[13] "13" "14" "15" "16" "17" "18" "19" "20"…
-
1 vote • 3 answers • 100 views
A: How to generate graphics from a file using a loop in R?
To read the data, the function read.csv2 is more appropriate, as it already uses ";" as the column separator. You can plot every year on the same chart with facet_grid or facet_wrap. In this case I will…
-
5 votes • 1 answer • 630 views
A: Do Not Remove Specific Data Frame
The following function does not remove objects of class obj.class that match pattern. keepObject <- function(pattern, obj.class, envir = .GlobalEnv){ obj <- ls(envir = envir) obj <-…
-
3 votes • 4 answers • 101 views
A: Dealing with dates of heterogeneous formats in R
This function solves the problem for vectors where all elements are in one of the two formats in the question. The function can easily be made more general if necessary. as_POSIXct_especial <-…
-
1 vote • 1 answer • 199 views
A: Create column with conditional values to those in another column
I’ll assume you have a table with a column called codigo with the values of the question. Base R solution. The following code starts by creating a column nova_coluna all with the same value, 'baixo…
-
0 votes • 2 answers • 64 views
A: SUMMATION IN A COLUMN IN A LINE INTERVAL
Here are two ways to sum a column by month. First you have to convert the column Date to a true date. dados$Date <- as.Date(dados$Date, "%m/%d/%Y") Now, sum the column INTE_C1 with aggregate.…
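A rough sketch of the two steps, conversion and monthly sum. The data values and the helper column Mes are assumptions made for illustration; the answer's own aggregate call is cut off and may be written differently.

```r
# Hypothetical data: dates stored as month/day/year text
dados <- data.frame(Date    = c("01/15/2019", "01/20/2019", "02/03/2019"),
                    INTE_C1 = c(1, 2, 5),
                    stringsAsFactors = FALSE)

# Step 1: convert the text column to class "Date"
dados$Date <- as.Date(dados$Date, "%m/%d/%Y")

# Step 2: build a year-month key (assumed helper column) and sum by it
dados$Mes <- format(dados$Date, "%Y-%m")
por_mes <- aggregate(INTE_C1 ~ Mes, dados, sum)

por_mes
```

Keeping the year in the key prevents January of different years from being summed together.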
-
2 votes • 1 answer • 54 views
A: Change x scale to range from 0 to 100 but showing decimal variations
Are you looking for a chart like this? library(tidyverse) library(scales) y_limites <- range(mt2) - c(2, 0) as.data.frame(mt2) %>% mutate(Autor = row.names(.)) %>% gather(Intervalo,…
-
3 votes • 1 answer • 49 views
A: Average by classes after excluding MIN and MAX
Here are two ways to do what the question asks. The trick is to use range to obtain the values of min and max at once. tapply(dados$valor, dados$classe, function(x){ mean(x[!x %in% range(x)],…
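The range trick can be sketched on a toy data set. The values are made up, and the call to mean is completed here without any further arguments, which may differ from the truncated original.

```r
# Hypothetical data: two classes with a few values each
dados <- data.frame(classe = c("a", "a", "a", "a", "b", "b", "b"),
                    valor  = c(1, 2, 3, 10, 5, 6, 7))

# range(x) returns c(min(x), max(x)); drop every element equal to either one
res <- tapply(dados$valor, dados$classe, function(x) {
  mean(x[!x %in% range(x)])
})

res
```

Note that %in% removes every occurrence of the minimum and maximum, so tied extremes are all dropped.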
-
6 votes • 1 answer • 298 views
A: Compare objects in R
This function compares column by column with identical and has as output a logical vector with the names of the columns of the first dataframe. If a column name exists in the first df but not in the…
-
4 votes • 2 answers • 70 views
A: R: How to create/save a vector using for and paste?
Instead of creating 3 (or more) vectors in the .GlobalEnv, the best practice is to keep them in a list. Imagine that the values come from a function, such as the function valores. valores <-…
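The list-based approach might look like the sketch below. The body of valores is a made-up stand-in, since the original definition is cut off, and the element names vetor1 to vetor3 are also assumptions.

```r
# Stand-in for the elided function: returns some values for index i
valores <- function(i) seq_len(i) * 10

# One list holds all vectors instead of three separate global objects
lista <- lapply(1:3, valores)

# paste0() builds the names that would otherwise be passed to assign()
names(lista) <- paste0("vetor", 1:3)

lista$vetor2
```

Each vector is then reached as lista$vetor1 or lista[["vetor1"]], and the whole set can be processed at once with lapply or sapply.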
-
3 votes • 2 answers • 2044 views
A: Transpose a dataframe in R
You can do what the question asks with xtabs after reformatting the data from wide format to long format. In this solution row.names(df2) gives the dates. df2 <- reshape2::melt(df, id.vars =…
-
4 votes • 1 answer • 149 views
A: Web Scraping in R
I believe the following answers the question. The problem is that table extraction is not automated at all, you need to know how many columns the table has. library(tidyverse) library('rvest') url…
-
1 vote • 1 answer • 38 views
A: How to specify demand in lpSolveAPI in R?
The package lpSolveAPI accepts constraints of three types, "<=", ">=" and "=". I believe that in the question the demand must be understood as maximum demand, not as fixed demand. Then the…
-
3 votes • 2 answers • 74 views
A: Separate a dataframe into sub-dataframes based on a condition
The easiest way is in one line of code. list2env(split(df_AI, df_AI$Regiao), envir = .GlobalEnv) split creates a list with a dataframe for each unique value of df_AI$Regiao. And list2env…
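The one-liner can be tried on a toy data frame. This is only a sketch: the contents of df_AI are made up, and a scratch environment is used instead of .GlobalEnv so that the example does not create objects in the workspace.

```r
# Hypothetical data with two regions
df_AI <- data.frame(Regiao = c("Norte", "Sul", "Norte"),
                    valor  = 1:3,
                    stringsAsFactors = FALSE)

# Scratch environment (an assumption; the answer writes to .GlobalEnv)
env <- new.env()

# split() gives a named list of data frames; list2env() turns each
# list element into an object named after its region
list2env(split(df_AI, df_AI$Regiao), envir = env)

ls(env)
```

After the call, env$Norte and env$Sul are ordinary data frames holding the rows of each region.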
-
2 votes • 2 answers • 909 views
A: How to replace variables with NEGATIVE values with ZERO within a data.frame in R?
Here’s another way, with the package dplyr. It uses mutate_if (conditional mutate) to determine which columns are numerical and modifies only those columns. The function neg2zero serves to make…
-
0 votes • 3 answers • 518 views
A: Conditional sum on each row in R?
I believe that the following function does what the question asks, with the difference that the final result is 60 and not 50, since if the first value of df[1, 3] should then be maintained the…
-
1 vote • 2 answers • 92 views
A: R - How to display loop progress? And processing time?
The following function reads the file arq_grande, filters it, and writes the file arq_out in chunks of size chunk_size. library(dplyr) readCSV <- function(CNJ, arq_grande, arq_out, chunk_size = 5000){ f…
-
5 votes • 2 answers • 182 views
A: How to use filter() to select only a part of the string?
One option is grepl, which returns TRUE/FALSE depending on whether it finds a match for a regular expression. The asterisk must be preceded by \\ because it is a metacharacter. filter(df_datasus, grepl('\\*E119',…
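The effect of escaping the asterisk can be checked in base R, without dplyr. A minimal sketch: the vector causas is made up, since the column used in the answer is cut off.

```r
# Hypothetical codes, some containing the literal text "*E119"
causas <- c("*E119", "I10", "A01*E119", "E119")

# "\\*" matches a literal asterisk; unescaped, "*" is a repetition operator
achou <- grepl("\\*E119", causas)

# Alternative: treat the whole pattern as a fixed string
achou_fixed <- grepl("*E119", causas, fixed = TRUE)

achou
```

Inside filter() the same logical vector selects the rows whose value contains the pattern.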
-
3 votes • 1 answer • 50 views
A: Word combination identification in R
The following regular expression does what the question asks. grep("Telefone\\s*[[:alnum:]]+\\s*apagar", teles, ignore.case = TRUE) #[1] 1 4 5 6
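To see what the pattern accepts, it can be run on a few made-up strings. The vector below is hypothetical and is not the teles data from the question, so the indices differ from the output shown above.

```r
# Hypothetical test strings
teles <- c("telefone fixo apagar",          # one word between: match
           "apagar telefone",               # wrong order: no match
           "Telefone123apagar",             # alphanumerics, no spaces: match
           "telefone muito antigo apagar")  # two words between: no match

# "Telefone", optional whitespace, one run of letters/digits,
# optional whitespace, then "apagar"; case is ignored
idx <- grep("Telefone\\s*[[:alnum:]]+\\s*apagar", teles, ignore.case = TRUE)

idx
```

Because [[:alnum:]]+ cannot span whitespace, the pattern allows exactly one word between the two keywords.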
-
3 votes • 2 answers • 89 views
A: Backreference metacharacter x does not match the corresponding groups when I change their order
I am going to simplify the code a little bit, since there is no need to call regex. The following is, in the case of the question, equivalent. library(stringr) b <- 'lentamente é mente lenta'…
-
0 votes • 2 answers • 649 views
A: Sum of vector in R
The function fun below calculates the sum of the question. One of the arguments is FUN, the function of I and of PHI, in the case of the question funcao_f. fun <- function(I, PHI, FUN){ tbl <-…
-
1 vote • 1 answer • 600 views
A: How to order x-axis that is date in ascending order?
It is better to convert the column to class "Date", not to class "factor". base_fox2$mês <- as.Date(paste("1", base_fox2$mês, sep = "/"), "%d/%b/%y") After this, if you just want the…
-
1 vote • 1 answer • 39 views
A: Problem while loading and converting .txt files
The code in this answer has been tested with files that match what we know from the description in the question; it is not guaranteed to work without errors or adaptation. First, if you are going to…
-
1 vote • 1 answer • 49 views
A: How to insert a data value in the last row of the list in R?
To compute moving averages, it is best to use one of the roll* functions from package zoo. In that case I’ll use rollmeanr, where the final r means that moving averages are aligned to the right. This is due…
-
3 votes • 1 answer • 55 views
A: Create new column in the partial match-based dataframe of the string without repeats
You can use grepl to give logical indices and then calculate positions in the intended result vector. i <- grepl("Payroll", dados$GLDESC) j <- grepl("Supply", dados$GLDESC) dados$KIND…
-
2 votes • 2 answers • 81 views
A: Difference between dates located in different rows and columns
The following function makes the calculation the question asks, with the differences calculated for each id. Note that date columns have to be of class "Date". fun <- function(DF){ f <-…
-
4 votes • 1 answer • 61 views
A: Add order of records according to date and id
In base R, you can do what the question asks with the function ave. It should be noted that the output of ave is of the same class as the first argument, so you must pass the column date as a…
-
1 vote • 1 answer • 702 views
A: How to stack multiple data frames in R?
I’m going to assume that the dataframes have names with something in common, in this case: they begin with "analise" and end in numbers. So a combination of ls/mget can automate the creation of a list to be…
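The ls/mget combination could be sketched as follows. The data frames and the scratch environment are assumptions for illustration, and the final stacking with do.call(rbind, ...) is one possible way to finish; the truncated answer may use a different call.

```r
# Scratch environment holding hypothetical data frames named analise<number>
env <- new.env()
assign("analise1", data.frame(x = 1:2), envir = env)
assign("analise2", data.frame(x = 3:4), envir = env)
assign("outro",    data.frame(x = 99),  envir = env)

# ls() with a pattern finds the names; mget() fetches the objects as a list
nomes <- ls(env, pattern = "^analise[0-9]+$")
lista <- mget(nomes, envir = env)

# Stack all data frames in the list by rows
todos <- do.call(rbind, lista)

nrow(todos)
```

The anchored pattern keeps objects such as outro out of the list even though they live in the same environment.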
-
0 votes • 2 answers • 509 views
A: How to create a sequence of dummy variables with loop in r
A base R solution, using Marcus Nunes's example, but with set.seed and with the dataframe name changed. set.seed(1234) df1 <- data.frame( empresas = sample(c("GLO", "AZU"), 10, replace =…
-
1 vote • 1 answer • 495 views
A: How to filter lines in R?
Here are three ways to do what the question asks, two in base R and one with the package dplyr. 1. aggregate. aggregate(Peso ~ Animal, dados, FUN = function(x) x[which.min(abs(x - 120))]) # Animal…
-
2 votes • 1 answer • 87 views
A: Plot of lines in ggplot
To annotate the axes with the data values one can use one of: scale_?_discrete for integer data or "factor"; scale_?_continuous for continuous numerical data. But we must pay attention to the values of breaks…
-
1 vote • 1 answer • 474 views
A: Bar graph in ggplot in r
As for the first question, just convert Trimestre into a vector of class "factor" with the levels in the desired order. As for the second question, the function percent has an argument accuracy…