Posts by Carlos Eduardo Lagosta • 5,497 points
162 posts
-
5 votes, 1 answer, 53 views
A: Wrong return when grouping rows from a data frame
Remove the data.test2$ prefix inside sum(..., otherwise dplyr will use the total sum of the whole column in every group. library(dplyr) # Example data set.seed(876) dados <- tibble(Curso =…
-
4 votes, 3 answers, 702 views
A: Add elements into a vector using loops and inputs
vetor <- rep(NA, 5) for (indice in seq_along(vetor)) { vetor[indice] <- as.numeric(readline(prompt = 'Numero: ')) } print(vetor) It is good practice in R not to create objects that expand in…
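The preallocation advice above can be sketched without interactive input. This is only an illustration: the `entradas` vector is a hypothetical stand-in for the values that `readline()` would return, and `crescente` exists only to show the pattern being discouraged.

```r
# Hypothetical input values standing in for what readline() would return
entradas <- c("3", "1.5", "10", "7", "2")

# Preallocate the full vector once, then fill it by position
vetor <- numeric(length(entradas))
for (indice in seq_along(vetor)) {
  vetor[indice] <- as.numeric(entradas[indice])
}

# Pattern to avoid: c() builds a new, longer vector on every iteration
crescente <- c()
for (valor in entradas) {
  crescente <- c(crescente, as.numeric(valor))
}

# Both approaches give the same values; only the first avoids repeated copying
print(identical(vetor, crescente))
```

For five values the difference is negligible; it tends to matter once the loop runs thousands of times.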
Tag: r, answered by Carlos Eduardo Lagosta (5,497)
-
3 votes, 1 answer, 103 views
A: Stacked area chart
As pointed out in the comments, there are two problems in your data: 1) not all dates have values for all countries (implicit missing values); 2) more than one value for the same country on the same date.…
-
2 votes, 2 answers, 172 views
A: Error when trying to run nls
nls is particularly susceptible to poor parameter estimates (initial values that are too far off and/or models poorly parameterized for the data). An alternative in case you cannot establish…
-
2 votes, 2 answers, 294 views
A: How to plot distinct regression models using the ggplot2 + ggpmisc or gridExtra packages?
There is no option for each facet to have its own regression type; you have to make two separate plots. To avoid repetition, you can write a function. To combine the plots, I used the ggpubr package,…
-
0 votes, 1 answer, 58 views
A: I need to join two columns of DATE and TIME with the class POSIXct but it returns NA
The NAs are generated because of the extra spaces in the format passed to strptime. mycbon <- read.table(text = " Date Time Receiver Transmitter 29/04/2019 05:31:33 134156 4822 29/04/2019 08:52:08…
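A minimal sketch of combining the two columns, using only the two date/time pairs visible in the preview. The column name `DateTime`, the helper object `carimbo` and the `tz = "UTC"` choice are assumptions made for the example; the key point is that the pasted string and the format string have the same layout.

```r
# Only the Date and Time columns from the preview are reproduced here
mycbon <- data.frame(
  Date = c("29/04/2019", "29/04/2019"),
  Time = c("05:31:33", "08:52:08"),
  stringsAsFactors = FALSE
)

# trimws() removes stray leading/trailing spaces before the pieces are joined
carimbo <- paste(trimws(mycbon$Date), trimws(mycbon$Time))

# The format mirrors the pasted string: one space between date and time
mycbon$DateTime <- as.POSIXct(carimbo, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

print(mycbon)
```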
-
0 votes, 2 answers, 87 views
A: Update data and add new data
There is no option for Excel files (or for flat data files such as CSV) to update specific rows, only to append new ones. The solution is to load the entire saved spreadsheet, delete the rows to…
-
2 votes, 1 answer, 152 views
A: Problem with the legend in ggplot, using a double y axis + geom_line()+geom_point() x geom_bar()
You’re defining color and group in the global aesthetics, which makes them apply to all geometries. Put in the global aesthetics the parameters that apply to all layers, and in the aesthetics of each geometry what…
-
3 votes, 3 answers, 1427 views
A: Moving average in R
When calculating moving averages the result will be shorter than the original data: library(zoo) set.seed(42) teste <- sample(1:50) mean <- rollmean(teste, 7, align = "right") >…
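The length difference can be checked with base R alone. This sketch swaps `zoo::rollmean` for `embed()` plus `rowMeans()`, which is a substitute technique and not the answer's own code; the names `k`, `media_movel` and `media_alinhada` are made up for the example.

```r
set.seed(42)
teste <- sample(1:50)
k <- 7  # window width, as in the preview

# embed() lays out every run of k consecutive values as one row,
# so the row means are the moving averages
media_movel <- rowMeans(embed(teste, k))

# The result has n - k + 1 elements, i.e. it is shorter than the input
print(c(original = length(teste), media = length(media_movel)))

# Padding with NA on the left restores the original length, with each
# average sitting at the last position of its window (right alignment)
media_alinhada <- c(rep(NA, k - 1), media_movel)
```

The padded version can be stored as a new column next to the original data, since both now have the same length.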
-
1 vote, 1 answer, 104 views
A: Loop with dataframes in ordered logistic regression
There are two problems with your loop: It runs a regression for each of the data frames, not for the data of the two together. On every iteration the contents of the object pom are overwritten, meaning in…
-
0 votes, 2 answers, 129 views
A: Loop Operation with Multiple Data Frames
Since you are using data.table, it is neither necessary nor recommended to split the data into different data.frames; it is better to do the operations by group. library(data.table) # Summary of the data…
-
2 votes, 1 answer, 48 views
A: Function to create charts per column
For what you want, you need to 1) use the column name (either directly using the name or getting it by the number) and 2) store the column name in an object. You can then use a loop (inside or outside the…
-
2 votes, 2 answers, 51 views
A: Apply function by groups or factors in R
Using data.table: library(data.table) dados <- fread(text = 'ano,key_cd7_ibge_mun,qtd_pop_mun,qtd_pop_est,qtd_pop_pais 2014,1100015,25652,1748531,202768562 2015,1100015,25578,1768204,204450049…
-
2 votes, 1 answer, 50 views
A: Loop to fill data returns only the last values
The way you wrote it, on every iteration of the loop the objects nome, P1, P2 and P3 have their values overwritten. Run length(nome) (or P1, etc.) and you will see that it contains only one value. Create…
-
4 votes, 1 answer, 150 views
A: Cumulative cases plot with ggplot2
dplyr library(dplyr) library(magrittr) library(ggplot2) dados <- read.csv2('arquivo_geral.csv') dados %<>% filter(casosAcumulados > 9) %>% group_by(estado) %>% mutate(diasposdez = 1:n())…
-
1 vote, 2 answers, 321 views
A: How to plot map with place names - ggplot - R
One option is to use the centroids to plot the names. Since you didn’t post your shapefile, I’m using one that I already have as an example and simulating some random data: shape <-…
-
2 votes, 1 answer, 71 views
Q: Show only ggpairs upper or lower triangle
The ggpairs function of the GGally package implements a version of pairs for ggplot. The plot type displayed in the upper triangle, lower triangle and diagonal of the matrix can be customized with the…
-
2 votes, 1 answer, 71 views
A: Show only ggpairs upper or lower triangle
This solution was given by Richard Telford on the English SO for the lower triangle; I’m expanding and detailing the answer here. ggpairs creates a list of plots that follows the matrix by rows.…
-
1 vote, 2 answers, 80 views
A: Organisation of the x-axis
The Var data were read as a factor because of the percentage symbol. There is no way to read them as numeric using read.table; you need to load the data and then process the string and convert it: dados_ibov…
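A small sketch of the string processing described above. The sample values and the object name `dados_exemplo` are hypothetical; only the column name `Var` comes from the preview. It assumes a dot as the decimal separator, so data using a decimal comma would need one more substitution.

```r
# Hypothetical percentages read as a factor, as read.table() can produce
dados_exemplo <- data.frame(
  Var = c("1.25%", "-0.40%", "0.00%"),
  stringsAsFactors = TRUE
)

# as.character() comes first: as.numeric() applied directly to a factor
# returns the internal level codes, not the values
texto <- as.character(dados_exemplo$Var)

# Drop the percent sign, then convert what is left to numeric
dados_exemplo$Var <- as.numeric(sub("%", "", texto, fixed = TRUE))

str(dados_exemplo)
```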
-
0 votes, 1 answer, 46 views
A: Date manipulation using dplyr and lubridate
# Example data set.seed(4) dados <- data.frame(Data = rep(paste0('2020-01-0', 1:3), each = 4), Média.Horária = sample(1:20, 12, TRUE)) Using dplyr As pointed out by @Rui-Barradas, when using…
-
0 votes, 1 answer, 31 views
A: Select the most recent date of each group in R
In order for maximum and minimum functions to be applied, you first need to convert your character/factor dates to a numeric class, such as POSIX: dados <- read.table(text = " data;code…
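A base R sketch of why the conversion matters. The sample rows and the day/month/year format are assumptions, since the original data are cut off in the preview; only the column names `data` and `code` come from it.

```r
# Hypothetical data: dates stored as text in day/month/year order
dados <- data.frame(
  data = c("01/03/2020", "15/01/2020", "10/02/2020", "05/04/2020"),
  code = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# As text, "15/01/2020" sorts after "01/03/2020", which is wrong as a date;
# after conversion the comparison is chronological
dados$data <- as.POSIXct(dados$data, format = "%d/%m/%Y", tz = "UTC")

# Sort newest first within each code, then keep the first row of each code
ordenado <- dados[order(dados$code, -as.numeric(dados$data)), ]
recentes <- ordenado[!duplicated(ordenado$code), ]

print(recentes)
```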
-
2 votes, 3 answers, 339 views
A: How to convert the number of hours into date format in R
If you only have full hours or half hours, you can use gsub to replace .5 with :30 before converting to POSIX. Write it as a function to avoid repeating code when applying it to the data.frame. exemplo…
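One way the gsub idea could look as a reusable function. Everything here is a sketch under assumptions: the function name `converter_horas`, the placeholder day `2020-01-01`, the column names and the sample values are not from the original answer.

```r
# Convert decimal hours (whole or half hours only) to POSIXct.
# `dia` is a placeholder date, needed because POSIXct always carries a day.
converter_horas <- function(horas, dia = "2020-01-01") {
  texto <- as.character(horas)
  # Half hours: "8.5" becomes "8:30"
  texto <- gsub("\\.5$", ":30", texto)
  # Whole hours have no colon yet: "8" becomes "8:00"
  inteiras <- !grepl(":", texto, fixed = TRUE)
  texto[inteiras] <- paste0(texto[inteiras], ":00")
  as.POSIXct(paste(dia, texto), format = "%Y-%m-%d %H:%M", tz = "UTC")
}

# Hypothetical data.frame with two columns of decimal hours
exemplo <- data.frame(inicio = c(8, 8.5, 13.5), fim = c(9.5, 10, 14))

# Apply the same function to every column without repeating code
exemplo[] <- lapply(exemplo, converter_horas)

print(format(exemplo$inicio, "%H:%M"))
```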
-
0 votes, 1 answer, 39 views
A: How to delete values in a text column from a csv file in R?
dados <- read.table(text = "1 {date:2018-08-01 state:RN store_id:3162633 sale_id:326463633336323 off_product_id:613665646663346 quantity:1 price:229.0 customer_id:null} 2 {date:2018-08-01…
-
4 votes, 1 answer, 71 views
A: How to create a heatmap for a calendar?
The package ggTimeSeries has the function ggplot_calendar_heatmap for that purpose: library(ggTimeSeries) dados <- data.frame( data = seq(as.Date("1/01/2019", "%d/%m/%Y"), as.Date("31/12/2019",…
-
2 votes, 2 answers, 74 views
A: Split a dataframe into sub-dataframes based on a condition
If I understand your question correctly, you want to create an object for each subset of the data.frame df_AI. If this is the case, you can use the function assign: df_AI <- data.frame(Regiao =…
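A minimal sketch of the assign approach, assuming a hypothetical `df_AI` with made-up `Regiao` values and a made-up `valor` column (the real data are cut off in the preview). The `df_` prefix for the new object names is also an assumption.

```r
# Hypothetical stand-in for df_AI
df_AI <- data.frame(
  Regiao = c("Norte", "Sul", "Norte", "Sul"),
  valor = 1:4,
  stringsAsFactors = FALSE
)

# split() returns a named list with one data.frame per region
partes <- split(df_AI, df_AI$Regiao)

# assign() creates one object per element, named after the region
for (nome in names(partes)) {
  assign(paste0("df_", nome), partes[[nome]])
}

print(df_Norte)
```

Keeping the subsets in the `partes` list is often easier to work with afterwards, since the list can be passed to `lapply`, but separate objects are what `assign` provides.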
-
2 votes, 1 answer, 46 views
A: Removal of lines with non-repeating levels in R
> df[df$X %in% comuns, ] X Y 1 a w 2 b w 3 c w 4 a K 5 b K 6 c K 7 a L 8 b L 9 c L 11 a Z 12 b Z 13 c Z Finding the common elements: tabF <- table(df$X, df$Y) comuns <-…
-
4 votes, 3 answers, 5543 views
A: Change the X and Y axes of ggplot graphs in R
# Example data set.seed(123) PAbr <- data.frame( Data = seq(as.Date("2018-01-01"), by = "1 month", length.out = 18), Produção = rnorm(18, 10^9, 10^8) ) library(ggplot2) ggplot(PAbr, aes(Data,…
-
1 vote, 2 answers, 81 views
A: Difference between dates located in different rows and columns
I prefer Rui Barradas's solution because it only needs the base package and follows the functional principle of R, but here is an answer with data.table: library(data.table) setDT(df1) # Converts…
-
0 votes, 1 answer, 203 views
A: Change time series chart scale
Two things: If all you care about is the number of months and not the dates themselves, you don’t have to convert to ts. As a time series, you will not be able to define the intervals using integer…
-
0 votes, 2 answers, 634 views
A: How to include a value in the last line of a data.frame in R
You can add values at any position directly; it is not necessary to create an object just for that: df <- data.frame(x = 1:15, y = 1:15) df[nrow(df)+1, "x"] <- 4 > tail(df) x y 11 11 11 12…
-
1 vote, 1 answer, 302 views
A: Add two different scales in ggplot
This is an alternative and not an answer. ggplot’s philosophy is not to implement features that lead to poor visualizations. This is the case of two Y axes for data of the same type: it is very easy…
-
1 vote, 1 answer, 397 views
A: How to create charts using time series showing every month on the x-axis in R?
The package ggfortify adds to ggplot the ability to recognize objects of class ts, among others. Scale detection by ggplot is usually good (in this case it will display months and years automatically),…
-
4 votes, 2 answers, 91 views
A: Extract information from a string
If you need a data.frame with names and prices: library(magrittr) # for the pipe operators dados <- strsplit(a, "(?<=[0-9] )", perl = TRUE) %>% unlist() %>% strsplit("R\\$") %>%…
-
2 votes, 1 answer, 865 views
A: Separation of rows into columns in R
A trick to select alternate rows is to use a logical vector in indexing, which will be recycled over the entire length of the data frame: dados.brutos <- structure(list(key = c("pais",…
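A short sketch of the recycling trick with hypothetical data, since the original `dados.brutos` is cut off in the preview. The object and column names below are made up for the example.

```r
# Hypothetical long-format data: keys on odd rows, values on even rows
dados_exemplo <- data.frame(
  linha = 1:6,
  valor = c("pais", "Brasil", "capital", "Brasilia", "moeda", "Real"),
  stringsAsFactors = FALSE
)

# c(TRUE, FALSE) is recycled along all rows, keeping rows 1, 3, 5
impares <- dados_exemplo[c(TRUE, FALSE), ]

# c(FALSE, TRUE) keeps rows 2, 4, 6
pares <- dados_exemplo[c(FALSE, TRUE), ]

# Put the two halves side by side as columns
resultado <- data.frame(key = impares$valor, value = pares$valor)
print(resultado)
```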
-
4 votes, 3 answers, 1493 views
A: How to number lines of a data.frame in R?
DADOS <- read.table(text = 'letra N1 N2 N3 N4 A 2 3 4 4 A 1 2 3 4 A 2 2 1 3 B 0 1 2 0 C 4 4 3 2 C 2 2 2 2 D 4 3 2 1 D 1 0 1 4 E 4 4 4 4', header = TRUE) DADOS$numeracao <- 1:nrow(DADOS) >…
-
1 vote, 2 answers, 299 views
A: R how to extract the first value from a data list.
R is a functional language; the extraction/indexing operator can be used as a function. So, if base_list[[1]][1] extracts the first value of the first item from the list, to extract the first item…
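A sketch of the operator-as-function idea. The contents of `base_list` are hypothetical; only the object name comes from the preview.

```r
# Hypothetical list of numeric vectors
base_list <- list(c(10, 20, 30), c(40, 50), c(60, 70, 80, 90))

# `[` is an ordinary function, so sapply() can call it on every item;
# the extra argument 1 is passed on as the index
primeiros <- sapply(base_list, `[`, 1)

# Equivalent version with an explicit anonymous function
primeiros_fn <- sapply(base_list, function(item) item[1])

print(primeiros)
```

The backticks are what allow the operator to be referred to by name, as with any other function.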
-
1 vote, 1 answer, 69 views
A: Combine a .shp file with a data frame
You need an aggregation by spatial location. The sf package has the function st_join for that purpose. As I do not have your data, I am using a shapefile of the districts of the city of São Paulo that…
-
3 votes, 3 answers, 1276 views
A: Creating a new variable using if and else
One more alternative. Create the new variable filled with 'INTERNATIONAL' and then change only the rows you want using basic indexing: base <- data.frame( iso_pais = c( '076', '840',…
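A minimal sketch of the fill-then-overwrite approach. The column name `tipo`, the label `'NATIONAL'`, the extra country codes and the choice of `'076'` as the code to relabel are assumptions, since the rest of the original code is cut off.

```r
# Hypothetical data; '076' and '840' appear in the preview, the rest is made up
base <- data.frame(
  iso_pais = c("076", "840", "032", "076"),
  stringsAsFactors = FALSE
)

# Step 1: fill the whole new column with the default value
base$tipo <- "INTERNATIONAL"

# Step 2: overwrite only the rows that match the condition
base$tipo[base$iso_pais == "076"] <- "NATIONAL"

print(base)
```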
-
5 votes, 2 answers, 75 views
A: Split Column into Other Two Lat Long
You can use strsplit to separate the coordinates, taking advantage of the lookbehind option of regular expressions. Because strsplit returns a list, the most practical approach is to first put the result…
-
5 votes, 3 answers, 397 views
A: How to categorize values in a Data Frame in R?
If you want to stick to the base package, you can use indexing and multiple comparisons: set.seed(123) dados <- data.frame( Municipio = LETTERS[1:6], IVS = runif(6) ) dados$IVScat[dados$IVS <…
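A sketch of categorizing by indexing, with fixed values in place of `runif` so the result is easy to follow. The cut points 0.3 and 0.5 and the labels are hypothetical, since the thresholds are cut off in the preview; `cut()` is shown only as a base R alternative for comparison.

```r
# Fixed hypothetical values instead of runif(), for a predictable result
dados <- data.frame(
  Municipio = LETTERS[1:6],
  IVS = c(0.10, 0.25, 0.35, 0.45, 0.60, 0.90)
)

# Indexing with one comparison per category (thresholds are assumptions)
dados$IVScat <- NA_character_
dados$IVScat[dados$IVS < 0.3] <- "low"
dados$IVScat[dados$IVS >= 0.3 & dados$IVS < 0.5] <- "medium"
dados$IVScat[dados$IVS >= 0.5] <- "high"

# Same classification with cut(); right = FALSE makes intervals [a, b)
dados$IVScat_cut <- cut(
  dados$IVS,
  breaks = c(0, 0.3, 0.5, 1),
  labels = c("low", "medium", "high"),
  right = FALSE,
  include.lowest = TRUE
)

print(dados)
```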
-
4 votes, 1 answer, 543 views
A: X axis minute by minute
First an example with a time line longer than five minutes: set.seed(321) CI <- data.frame( Time = as.POSIXct(sort(sample(1541062677:1541063688, 10)), origin = '1970-01-01 00:00.00 UTC'), CIDM =…
-
7 votes, 1 answer, 452 views
A: bar graph ggplot 2 vectors side by side
You can manually shift the position of any layer using position_nudge. You just need to adjust the width of the bars as well: ggplot(a, aes(x = year)) + geom_bar( aes(y = v1), stat = "identity",…
-
2 votes, 2 answers, 144 views
A: for loop to calculate a function
Your code is doing exactly what you specified. You provided a value of x and a value of y, and that was the result returned: > 0^0+0 [1] 1 To use for, you must specify a sequence, for example: x…
-
2 votes, 1 answer, 146 views
A: Manipulation of CSV in R
The data.table::fread function is an optimized version of read.table. The skip option accepts a string that marks the line where reading of the file should start. The function is also quite efficient in…
-
6 votes, 1 answer, 349 views
A: R corrplot - coloring based on correlation values
The problem is in how the option col, when custom palettes are used, works together with cl.lim. The package documentation talks about it. See what happens with and without cl.lim. I am using the…
-
8 votes, 2 answers, 4121 views
A: R - Map of Brazilian cities
The latest versions of ggplot2 have map-specific geometry. The great advantage is that you don’t need to merge the data.frame holding the data with the spatial object (but for that the spatial object…
-
4 votes, 1 answer, 56 views
A: Notification from one date to another R
I don’t work with RStudio and Shiny, so I don’t know what automation would be like within what you’re developing, but here’s how you can do the math. I modified your sample data to have more than one…
-
4 votes, 1 answer, 200 views
A: Multiple Linear Regression in R
dados <- read.table('IVCM.txt', header = TRUE) regLin <- lm(IVCM ~ TRAT * RE, dados) # the asterisk in the formula means the interaction is calculated as well # you can use "+" instead if…
-
5 votes, 2 answers, 639 views
A: How to split the dataframes of a list based on a group variable, common in all of them?
Applying subset with multiple criteria: lapply(mylist, subset, (group == 'a' | group == 'c') & number > 40) [[1]] number group 2 40.39104 a 7 47.56538 c 8 43.14062 c 9 47.42608 c [[2]] number group 1…
-
5 votes, 4 answers, 7118 views
A: Changing the name of a variable in a dataframe R
Using grep to find the column number you want to rename: dados <- data.frame( 'Year' = 2015:2018, 'Country' = 'Brazil', 'Continent' = 'America' ) names(dados)[grep('Country', names(dados))] <-…
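A complete sketch of the grep-based rename. The new name `'Pais'` is hypothetical, since the replacement value is cut off in the preview, and the `^...$` anchors are an addition here to avoid matching other columns that merely contain the same text.

```r
dados <- data.frame(
  Year = 2015:2018,
  Country = "Brazil",
  Continent = "America"
)

# grep() returns the position of the matching column name;
# the anchors make it an exact match rather than a partial one
posicao <- grep("^Country$", names(dados))

# Replace only that name ('Pais' is an assumed new name)
names(dados)[posicao] <- "Pais"

print(names(dados))
```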