Posts by Carlos Eduardo Lagosta • 5,497 points
162 posts
-
1 vote
2 answers
46 views
A: Multiple Moving Average Columns
The question is tagged dplyr, but here’s an alternative using data.table: library(data.table) dt <- as.data.table(x) dt[, `:=`(spread_3 = frollmean(x, 3), spread_5 = frollmean(x, 5))] > dt x…
-
4 votes
2 answers
53 views
A: How to count repetitions in a range using length(which()) in R
A1:A2 will not work because you are dealing with text strings. As suggested by @Rui-Barradas in the comments, you can build the strings using paste: length(which(exemplo %in% paste0("A", 1:4))) #> [1]…
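A minimal sketch of the idea in this answer, assuming a hypothetical character vector `exemplo` (its contents are invented here for illustration; the question's data is not shown in this listing). `paste0` builds the labels and `%in%` tests membership:

```r
# Hypothetical data; the vector from the original question is not shown
exemplo <- c("A1", "B2", "A3", "A7", "A2", "C1", "A1")

# A range such as A1:A4 is not valid for text, so build the labels explicitly
alvos <- paste0("A", 1:4)

# Count how many elements of exemplo belong to that set
n_which <- length(which(exemplo %in% alvos))

# Summing the logical vector is an equivalent, shorter form
n_sum <- sum(exemplo %in% alvos)
```

Both forms count the same elements; `sum()` simply skips the intermediate index vector.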
-
2 votes
1 answer
36 views
A: Change the language of the result of a web-scraping with rvest from the IMDb site
You can use httr::add_headers to specify the desired language: library(rvest) library(httr) library(dplyr) imdb <- paste0(imdb_url, "/textlist") %>% html_session(add_headers("Accept-Language" =…
-
6 votes
1 answer
45 views
A: Merge columns into a single string in R
You can use apply to run paste over the rows of the selected columns (using collapse instead of sep): df$junto <- apply(df[, 1:5], 1, paste, collapse = "") > head(df) V1 V2 V3 V4 V5 junto 1 9 9 9 9 9 99999 2 1…
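A small self-contained sketch of the approach described in this answer. The data frame is hypothetical, and the `do.call(paste0, ...)` variant is an alternative added here for comparison, not part of the original answer:

```r
# Hypothetical data frame with five numeric columns
df <- data.frame(V1 = c(9, 1), V2 = c(9, 2), V3 = c(9, 3),
                 V4 = c(9, 4), V5 = c(9, 5))

# apply() with MARGIN = 1 walks over rows; collapse = "" glues each row
# into a single string (sep would only separate multiple arguments)
df$junto <- apply(df[, 1:5], 1, paste, collapse = "")

# Vectorised alternative: paste the columns element by element
df$junto2 <- do.call(paste0, df[, 1:5])
```

The `apply` version converts the selected columns to a matrix first, so it suits columns of a single type; the `paste0` version works column by column.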
-
2 votes
2 answers
55 views
A: Transform dataframe from long format to wide
If you want to make a multi-column summary, you do not need to convert from wide to long and then go back to wide; just apply the same calculation to all selected columns. colSel <-…
-
2 votes
1 answer
41 views
A: R - how to use .xlsx and .csv files with a OneDrive URL
Open OneDrive and, in the file's context menu, choose the "Embed" option. It will generate something like: <iframe…
-
1 vote
2 answers
81 views
A: Format data from a column in a data.frame in R
You can use the function sprintf, indicating the desired format: dados <- c(1, 12, 123, 1234, 12345) > sprintf("%04d", dados) [1] "0001" "0012" "0123" "1234" "12345" Note that the number of…
-
1 vote
1 answer
26 views
A: Date below the chart
See the help for barplot; it has an argument for supplying the bar names: # Example data set.seed(757) dados <- data.frame(date = paste(2020, 1:8, sep = "-"), cases = rpois(8, 4)) with(dados,…
-
2 votes
2 answers
64 views
A: Select columns from a dataset without having to read the whole file
Complementing the answer from @lmonferrari. As the microdata files are very large, I will use the mtcars example dataset: write.csv(mtcars[1:3,], "exemplo.csv") data.table: The function fread from the package…
-
3 votes
2 answers
118 views
A: Logarithmic scale - Histogram R
A histogram is a bar chart of the element counts per interval. You can calculate this information with the function hist and use barplot to draw the chart with adjustments: histPrec <-…
-
0 votes
2 answers
39 views
A: Editing a plot in ggplot2
As Marcus Nunes answered, if the new terms ("est" and "Obs") will be used throughout your workflow, the most practical approach is to change them directly in the dataset. In cases where this is not…
-
2 votes
2 answers
51 views
A: How to manipulate two data sets at the same time?
As @Marcus-Nunes answered, it is not possible to manipulate two variables at the same time; the best practice in R, if you have a procedure that will be repeated several times, is to write a…
-
3 votes
1 answer
155 views
A: Converting a list of lists of vectors into a list of matrices - R
If the structure of all list elements is the same, i.e., they all have the same names and lengths, you can put them in a single array and then separate them by the number of variables and…
-
3 votes
1 answer
89 views
A: Change the distance between the bar and the Y axis in R
The expand option of the scale functions exists for this. It can be used with the function expansion for greater control: library(ggplot2) p <- ggplot(df, aes(j, reorder(k, -j))) + geom_col(fill="#70A2E7", width…
-
6 votes
1 answer
85 views
A: Scatter plot in ggplot2
To use different variables in ggplot, your data needs to be in long format. There are several options for this: dados <- read.table(…
-
4 votes
2 answers
108 views
A: Automate column subtraction in R
The function diff calculates the differences between consecutive elements. The function apply applies a function along a dimension. When applied to the rows of the data.frame, it will return a transposed matrix,…
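A short sketch of the behaviour described in this answer, using a hypothetical three-column data frame (the values are invented for illustration):

```r
# Hypothetical data: two rows, three numeric columns
df <- data.frame(a = c(1, 2), b = c(4, 7), c = c(9, 8))

# diff() returns the differences between consecutive elements
d <- diff(c(1, 4, 9))

# apply() over rows (MARGIN = 1) stores each row's result as a column,
# so the output comes back transposed; t() restores one row per input row
difs <- t(apply(df, 1, diff))
```

After the transpose, row i of `difs` holds the column-to-column differences for row i of `df`.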
-
3 votes
3 answers
95 views
A: Date sequence from a range in R
Similar to the answer from @lmonferrari, but using data.table: library(data.table) setDT(base) # set as data.table > base[, .(DATA = seq(as.IDate(data_inicio), as.IDate(data_fim), by = "1…
-
5 votes
1 answer
50 views
A: Create R column from the number of records by 2 Ids
One solution: number the rows within each category and convert the numbers to types. The packages dplyr and data.table make per-category operations easier. The conversion can be done with a dictionary. # Dictionary tipos…
-
3 votes
1 answer
41 views
A: Grid on R does not work
The argument panel.first is evaluated lazily, which brings some limitations. One is that it does not work when a formula is used to specify the plot. The simplest solution to have the grid in…
-
5 votes
1 answer
60 views
A: How do I create a list to save countless Numbers generated by my function in R?
I’m going to use a generic example, so the answer is useful to as many users as possible. In this case, I generate a list with the box plot of each column of a data.frame. # Example data dados <-…
-
6 votes
1 answer
82 views
A: What are the differences between Rmarkdown and Rnotebook?
Rnotebook uses R Markdown, which is an extension of the Markdown markup language with support for embedding blocks of R code. That is, in both cases it is the same file format, with the same…
-
1 vote
1 answer
66 views
A: Problems with the lme function with nested variables (any(notIntX <- ! apply(X, 2, const))
You are not specifying the categorical variables in the appropriate way. In nlme::lme they stay out of the formula, in the random option. In lme4::lmer they go directly in the formula, but delimited by…
-
1 vote
1 answer
39 views
A: How do I interact specific values of a dataframe with a function?
Use indexing to locate values that match certain criteria: df[df$Idade == 2, "AT.83M"] # or with(df, AT.83M[Idade == 2]) Store the value in an object or apply the function directly: >…
-
2 votes
3 answers
71 views
A: Conditional column based on multiple rows with dplyr
@lmonferrari has already shown how to use dplyr::case_when in your case; this is another option, using ifelse and unique: library(dplyr) df %>% group_by(id) %>% mutate(output =…
-
6 votes
1 answer
78 views
A: How do I distribute the content of a column to other columns in R?
As the information is grouped at regular intervals, you can use conditional indexing: df2 <- data.frame( numero = df$Coluna[c(TRUE, FALSE, FALSE)], periodico = df$Coluna[c(FALSE, TRUE, FALSE)],…
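A runnable sketch of this indexing trick, under stated assumptions: the contents of `df$Coluna` are invented, and the third column name `ano` is hypothetical because the original snippet is cut off before it. The technique relies on R recycling a logical index that is shorter than the vector:

```r
# Hypothetical column where every three rows hold number, journal, year
df <- data.frame(Coluna = c("1", "Revista A", "2019",
                            "2", "Revista B", "2020"))

# c(TRUE, FALSE, FALSE) is recycled along the vector,
# so it keeps positions 1, 4, 7, ... and so on for the other patterns
df2 <- data.frame(
  numero    = df$Coluna[c(TRUE, FALSE, FALSE)],
  periodico = df$Coluna[c(FALSE, TRUE, FALSE)],
  ano       = df$Coluna[c(FALSE, FALSE, TRUE)]  # hypothetical column name
)
```

This only works when every record occupies exactly the same number of rows; a missing field would shift all later values.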
-
5 votes
1 answer
349 views
A: Mean and median legend for a boxplot in ggplot2
The function legend only works with plot, hence the error. ggplot2 uses its own functions and syntax, and follows the principle of not allowing (or making very difficult) any action that leads to potentially…
-
2 votes
3 answers
49 views
A: Normality and fragmentation of the sample
Similar to the answer from Marcus Nunes, but with data.table: library(data.table) setDT(df_1) > df_1[, shapiro.test(x), by = y] y statistic p.value method data.name 1: 2 0.8751273 0.04955792…
-
2 votes
4 answers
150 views
A: Calculate percentage of an item in a group per year in R
You need to supply the column names to mutate_at. In addition, you must group only by Activity (otherwise the total used in the percentage calculation will be computed for each item of each activity).…
-
3 votes
4 answers
150 views
A: Calculate percentage of an item in a group per year in R
The question is tagged dplyr, but for the record, here’s how to calculate the percentages per group for all numeric columns using data.table: library(data.table) setDT(dados) percentage <-…
-
2 votes
1 answer
69 views
A: I need to comply with ANVISA Guide 10, which asks for a specific weighting
From ANVISA Guide 10, page 9: If the individual points are denoted by (x1, y1), (x2, y2), (x3, y3) ... (xi, yi) ... (xn, yn), the corresponding standard deviations are s1, s2, s3 ... si ... sn.…
-
2 votes
1 answer
91 views
A: Transform named list to dataframe
Since you want the hex codes quoted, you need a bit of string manipulation: dt <- data.frame( palette = names(mylist), hex = Reduce(rbind, sub("$", "'", sub("^", "'", lapply(mylist, paste, collapse…
-
2 votes
2 answers
51 views
A: How to find out how many times a certain level is repeated in a specific column (factor) of a data.frame?
Using only base R: put the data.frames in a list and apply table to it: # Example data set.seed(87365) W <- data.frame(Coluna1 = LETTERS[1:20], Coluna2 = as.factor(sample(1:3, 20, TRUE)))…
-
2 votes
1 answer
56 views
A: Wrong numbering of data frame rows in R
The numbering is not wrong, because what you see is not the row number but the row name. Consider this example: set.seed(56) exdf <- data.frame(letra = sample(LETTERS[1:4], 4), numero = 1:4) >…
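A minimal sketch of the point made in this answer, with a fixed hypothetical data frame instead of the sampled one so the result is predictable. It shows that subsetting keeps the original row names and how to reset them:

```r
# Hypothetical data with fixed values
exdf <- data.frame(letra = c("C", "A", "D", "B"), numero = 1:4)

# Subsetting keeps the original row names, so they no longer run from 1 to n
sub <- exdf[exdf$numero > 2, ]
nomes <- rownames(sub)

# Assigning NULL resets the row names to a fresh sequence
rownames(sub) <- NULL
```

The labels printed on the left of a data frame are names, so they travel with the rows through filtering and sorting.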
-
5 votes
2 answers
51 views
A: Special character removal in Software R
Use the sub/gsub functions from base R: data$Rodada <- gsub("ª", "", data$Rodada) Or, using data.table syntax: data[, Rodada := gsub("ª", "", Rodada)] You can also remove the "ROUND" and…
-
1 vote
1 answer
47 views
A: Successfully recorded bank warning - R
Here’s how to generate a simple log file, which records the date and time and any possible error message: logf <- file("exemplo.log", open = "a") writeLines(as.character(Sys.time()), logf)…
-
2 votes
2 answers
428 views
A: How to return only repeated values in R?
Rui Barradas has already answered in the comments; I will expand on that. To make it easier to visualize and to keep the answer general, I will use simulated data: set.seed(736) letras <-…
-
1 vote
1 answer
63 views
A: How to obtain the R² and plot curve equation from the object created by the Mostest function
Since your question does not specifically depend on your measurements, I will use generic simulated data to make it easier for other users to reproduce: set.seed(86) x <- sample(1:20, 100, replace = TRUE)…
-
0 votes
3 answers
3287 views
A: How to put the regression equation on a graph?
For the record, one more option: the function ggpubr::stat_regline_equation. The syntax is the same as for ggpmisc::stat_poly_eq: library(ggplot2) library(ggpubr) ggplot(dados, aes(x, y)) +…
-
3 votes
1 answer
62 views
A: Divide bibliographic references into columns in R
The references follow a pattern, just not a fixed length. The general format is: SURNAME, FIRST NAME; SURNAME, FIRST NAME. Title of the article. TITLE OF THE JOURNAL, v. X, p. X-X, YEAR. As the fields are delimited…
-
1 vote
2 answers
126 views
A: Geom_area with different fill colors
Similar to the answer by Rui Barradas, but first detecting the rows that are followed by an intersection and then determining the x-axis value using linear interpolation. I’m using geom_ribbon instead of geom_area as…
-
1 vote
1 answer
226 views
A: How to break caption text in R
Here’s a simple function to break the line into as many pieces as you want (two by default): div.texto <- function(string, n = 2) { comp <- (nchar(string)/n)*1.2 paste(strwrap(string, comp),…
-
4 votes
1 answer
152 views
A: How to transform each digit of a number into an element of a vector in R?
You can convert to character and use strsplit, then convert back to number: numero <- 516481*10^9 algarismos <- as.integer(strsplit(as.character(format(numero, scientific = FALSE)), "")[[1]])…
-
2 votes
2 answers
83 views
A: Reference one color column based on another
As pointed out by @Rui-Barradas in the comments, you can use setNames to associate the color names with the hex codes; you only need to do this as a list and ensure that the names are associated with…
-
5 votes
2 answers
115 views
A: Regression Graph in R or Python
The lines in the graph you want to reproduce do not correspond to the result of a linear regression, but simply to percentages of the value of x. You can add them with stat_function. If you are…
-
1 vote
3 answers
95 views
A: Remove data frame row with string divided into multiple columns
You can use the nrows option to read only up to the next-to-last line. That way the data is loaded in the appropriate format and you do not need to pre-process the file. You just have to first determine…
-
1 vote
2 answers
84 views
A: Warning message: Those produced when calculating confidence intervals
Rui Barradas already gave a good answer on how to calculate the confidence interval; this one is to help you understand what you were doing wrong, because it is not only a programming error but a matter of understanding the…
-
2 votes
1 answer
83 views
A: Error while scaling on ggplot map
Since you are cropping the area to be plotted but passing the complete dataset to scalebar, and the scale bar is also larger than the displayed area, you get the error "out of…
-
2 votes
2 answers
81 views
A: How to join more than two dataframes in R?
For the record: the same can be done using only base functions: dados <- Reduce(function(a, b) merge(a, b, all = TRUE), list(x, y, z)) merge joins two data frames by an identification…
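A self-contained sketch of this `Reduce` plus `merge` pattern. The three data frames and their column names are hypothetical, invented so the result can be checked by hand; the only shared column is `id`, which `merge` uses as the key by default:

```r
# Hypothetical data frames that share only the "id" column
x <- data.frame(id = 1:3, v1 = c(10, 20, 30))
y <- data.frame(id = 2:4, v2 = c("u", "v", "w"))
z <- data.frame(id = c(1, 4), v3 = c(TRUE, FALSE))

# Reduce() folds the list pairwise: merge(merge(x, y), z)
# all = TRUE requests a full outer join, filling gaps with NA
dados <- Reduce(function(a, b) merge(a, b, all = TRUE), list(x, y, z))
```

Because the fold is pairwise, the same line works for any number of data frames in the list, as long as each pair shares a key column.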
-
3 votes
1 answer
42 views
A: Can anyone tell me if you know any R packages to use with the quasi-beta distribution family in this model?
It is not possible to give a proper answer without knowing what your data looks like. But in principle you can use glm with the "quasi" family, specifying the variance function to be the same as in the Beta:…
-
1 vote
2 answers
72 views
A: How to use the rm function without erasing everything, leaving only one or two vectors?
To remove all but one object: rm(list = setdiff(ls(), "a")) To remove all but some objects: rm(list=ls()[!ls() %in% c("a", "b")]) # or rm(list = setdiff(ls(), c("a", "b")))…
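A cautious sketch of the `setdiff` form, run inside a scratch environment so the example does not wipe the global workspace. The environment `e` and the object names are hypothetical; in an interactive session the `envir` arguments can be dropped:

```r
# Scratch environment standing in for the workspace
e <- new.env()
assign("a", 1, envir = e)
assign("b", 2, envir = e)
assign("tmp1", 3, envir = e)
assign("tmp2", 4, envir = e)

# Names to keep; everything else in e is removed
manter <- c("a", "b")
rm(list = setdiff(ls(envir = e), manter), envir = e)

# What is left afterwards
restantes <- ls(envir = e)
```

Passing names as strings through `list =` matters here: `rm` expects object names, so the selection has to be made on the character vector returned by `ls()`.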