Posts by Rui Barradas • 15,422 points
432 posts
-
2 votes
1 answer
46 views
A: How do I build the Summary() function in R for a data frame?
Here’s a possible function, resumo_df. It first checks whether the argument X is a data.frame and throws an error if it is not; next, it determines which columns are numeric; and for each numeric column it calls…
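The steps listed in this excerpt can be sketched roughly as below. This is only an illustration, not the function from the answer: the statistics computed per column (minimum, median, mean, maximum) are an assumption, since the excerpt is cut off before it says what is called for each numeric column.

```r
# Hypothetical sketch of a summary function for data frames.
resumo_df <- function(X) {
  # Step 1: stop with an error if X is not a data.frame
  if (!is.data.frame(X)) stop("'X' must be a data.frame")
  # Step 2: determine which columns are numeric
  num <- vapply(X, is.numeric, logical(1))
  # Step 3: compute a few statistics for each numeric column
  # (the choice of statistics is an assumption)
  vapply(X[num], function(x) {
    c(Min = min(x, na.rm = TRUE),
      Median = median(x, na.rm = TRUE),
      Mean = mean(x, na.rm = TRUE),
      Max = max(x, na.rm = TRUE))
  }, numeric(4))
}

# Column b is not numeric, so only column a is summarised
res <- resumo_df(data.frame(a = 1:5, b = letters[1:5]))
res
```

The result is a matrix with one row per statistic and one column per numeric variable.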
-
4 votes
1 answer
40 views
A: How do I write a function with a loop in R?
Here are five ways to solve the problem. 1. With a while loop H1 <- function(n){ numer <- 2L denom <- 3L termos <- 0L total <- 0L while(termos < n){ total <- total + numer/denom…
-
0 votes
1 answer
28 views
A: Sum of 2 data.frame columns
You have to convert to numeric before adding. But first check the class of the columns with str(educa[3:4]). If the columns of educa[3:4] are of class "factor": educa[3:4] <- lapply(educa[3:4],…
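A minimal sketch of the factor case, assuming a made-up data frame (the real educa is not shown in the excerpt, and the function passed to lapply there is cut off). Going through as.character first is one common way to recover the printed values, because as.numeric applied directly to a factor returns the internal level codes.

```r
# Hypothetical data: columns 3 and 4 were read as factors
educa <- data.frame(
  id   = 1:3,
  nome = c("a", "b", "c"),
  x    = factor(c("10", "20", "30")),
  y    = factor(c("1.5", "2.5", "3.5"))
)

# Check the class of the columns first
str(educa[3:4])

# Convert factor -> character -> numeric (an assumed conversion function)
educa[3:4] <- lapply(educa[3:4], function(f) as.numeric(as.character(f)))

# Now the two columns can be added
educa$soma <- educa[[3]] + educa[[4]]
educa
```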
-
3 votes
1 answer
49 views
A: function summarise
The following code counts the medals that Brazil had at the Olympic Games. library(dplyr) library(readr) library(ggplot2) fl <- list.files(pattern = 'athlete.*\\.csv$') fl cols_spec <- cols(…
-
2 votes
2 answers
22 views
A: How to save content from data sets to files dynamically?
dsets is a character vector, not a list of data sets. To access the data.frames you have to retrieve them with get. In addition, it is not necessary to compose the file names one by one; paste is a vector…
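The two points can be illustrated with a small sketch. The data set names, the output folder (a temporary directory) and the CSV format are assumptions made for the example, not details from the answer.

```r
# dsets holds the *names* of the data sets, as strings
dsets <- c("mtcars", "iris")

# paste0() is vectorised: one call builds all the file names
arquivos <- file.path(tempdir(), paste0(dsets, ".csv"))

# get() retrieves the object that each name refers to
for (i in seq_along(dsets)) {
  write.csv(get(dsets[i]), arquivos[i], row.names = FALSE)
}

basename(arquivos)
```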
-
5 votes
1 answer
32 views
A: Create conditional with counter in a column
The problem can be solved with a diff/cumsum trick, which gives the segments of the vector a that satisfy the condition, followed by ave/seq_along to give integer sequences. And to put zeros in the right…
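A minimal sketch of the diff/cumsum plus ave/seq_along idea, under assumed data and an assumed condition (x == 1); the question's actual vector and condition are not shown in the excerpt.

```r
x <- c(0, 1, 1, 0, 1, 1, 1, 0)   # hypothetical input
cond <- x == 1                   # hypothetical condition

# diff() flags the positions where the condition changes;
# cumsum() turns those flags into one id per run of equal values
grp <- cumsum(c(TRUE, diff(cond) != 0))

# ave()/seq_along() numbers the elements inside each run
contador <- ave(as.integer(cond), grp, FUN = seq_along)

# multiplying by the condition puts zeros where it does not hold
contador <- contador * cond
contador
```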
-
1 vote
2 answers
35 views
A: How to export [[ ]] from a list in separate excel files using R?
Here are three different ways, with three different packages, of exporting a list of data.frames to an Excel ".xlsx" file with each data.frame on its own sheet. The list is the one from the question. The split…
-
5 votes
1 answer
65 views
A: Half-normal residual plots in ggplot2
According to the Value section of help("hnp"), the object Graph16 is a list of class "hnp". These objects have members x, lower, upper, median and residuals, which are the vectors represented in the…
-
2 votes
2 answers
53 views
A: How to count repetitions in a range using length(which()) in R
Maybe the following function can solve the problem. It accepts two strings in the format of the question and creates a vector of numbers in sequence. rangeAlphanum <- function(x, y){ xchar <-…
-
3 votes
2 answers
55 views
A: Transform dataframe from long format to wide
I believe the problem is solved if you remove the column valor before reformatting to wide; pass id_cols = c(ano, estado) to pivot_wider. dados %>% mutate(ano = year(data)) %>%…
-
3 votes
1 answer
26 views
A: Turn an observation into a variable in R
To reformat data from long to wide format, you can use the tidyverse: library(dplyr) library(tidyr) dados %>% pivot_wider( id_cols = c(Year, Month), names_from = Subcategory, values_from = Value…
-
0 votes
1 answer
60 views
A: How to save a pdf of the statistical report generated with Expdes.pt package?
The best way is with the package rmarkdown. See also R Markdown. Save a text file so_pt.Rmd (or with another name, but keep the extension) with the following content. --- title: "so_pt" author: "Rui Barradas"…
-
2 votes
2 answers
62 views
A: Perform a calculation with previous values of a column in R
Here is a solution in base R, vectorized with sign. n <- nrow(dados) inx <- seq_len(n)[-(1:2)] dados$col2 <- 1 dados$col2[inx] <- sign(dados$col1[inx] - 1.02*dados$col1[inx - 2]) Test…
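To see what the vectorized code does, here it is laid out and run on a small made-up data frame (the values of col1 are an assumption; the question's data are not shown). Each value from the third row on is compared with 1.02 times the value two rows earlier, and sign returns 1, 0 or -1.

```r
# Hypothetical data
dados <- data.frame(col1 = c(100, 101, 103, 102, 104))

n <- nrow(dados)
inx <- seq_len(n)[-(1:2)]   # rows 3..n, the ones with a value two rows back

dados$col2 <- 1             # the first two rows keep the value 1
dados$col2[inx] <- sign(dados$col1[inx] - 1.02 * dados$col1[inx - 2])
dados
```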
-
1 vote
1 answer
23 views
A: Complete values of a column with values of the column itself in data.table
Here’s a solution. Use a function f to calculate the new value of pib_per_capta (there is an i missing in capita). f <- function(x){ y <- x[["pib_per_capta"]] pib <- tail(y, n = 3)[-3] new_pib <-…
-
1 vote
1 answer
82 views
A: How to insert text into a scatter plot (with ggplot)?
Maybe the following will solve the problem. A data.frame is first created with only the row corresponding to the maximum value of x. Then a color vector for each row of this df. And in geom_text,…
-
2 votes
2 answers
62 views
A: Mixed effects model residual plots using ggplot2
As a complement to the excellent answer by Jorge Mendes, this answer separates the residuals by levels of DummyVariable in the plot p1b. This is done with aes(group = DummyVariable). The second…
-
3 votes
1 answer
30 views
A: Graph error with package ggplot2 and function sumarySE
You are reading the file incorrectly; a correct way is as follows: google_id <- "1X7FEhxjxAVBD-9LB6UrRUd2aMtjcwnyR" google_file <- sprintf("https://docs.google.com/uc?id=%s&export=download",…
-
3 votes
1 answer
25 views
A: Problem using spread()
The function spread is notoriously difficult to use. Hence the package tidyr now recommends pivot_wider. From help("spread"): Development on spread() is complete, and for new code we recommend…
-
1 vote
2 answers
118 views
A: Logarithmic scale - Histogram R
You can define a function that first calls hist with plot = FALSE in order to obtain the vector of counts. Counts equal to 0 are given the value NA so that they do not appear on the chart. That way R does not compute…
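A rough sketch of that idea, with a hypothetical helper name (hist_contagens) and made-up data; it is not the function from the answer, which is not visible in the excerpt.

```r
# Hypothetical helper: compute the histogram without plotting it
# and replace empty bins by NA
hist_contagens <- function(x, ...) {
  h <- hist(x, plot = FALSE, ...)
  h$counts[h$counts == 0] <- NA   # log(0) is -Inf, NA is simply skipped
  h
}

set.seed(2021)
x <- c(rnorm(200), 10)   # the isolated value 10 leaves empty bins
h <- hist_contagens(x)
h$counts
```

The counts can then be drawn on a logarithmic scale, for instance with plot(h$mids, h$counts, type = "h", log = "y"); bins holding NA are left out of the chart.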
-
2 votes
1 answer
22 views
A: How to turn part of a column into another with data.table?
The following solution uses tstrsplit, a combination of transpose and strsplit. But before separating the column into two, replace the first space with a "_", since this character cannot be code…
-
2 votes
1 answer
28 views
A: Overlay the legend of the estimated lines using the stat_poly_eq function
The way to keep the equations from overlapping is to use the arguments label.x.npc, for alignment on the x-axis, and label.y.npc, for alignment on the y-axis. The latter is the one that needs care.…
-
2 votes
1 answer
37 views
A: Adjusted regression line considering different factors in ggplot2
First, to read the data it is better, and simpler, to use read.csv2 (see help("read.csv")), which is the version of read.table for CSV files with sep = ";" and dec = ",". In CSV files there are always column…
-
1 vote
1 answer
29 views
A: How to convert the format of a Date vector without it changing to Character?
There is no need to convert the format before plotting the chart; the package ggplot2 recognizes objects of class "Date", and with the arguments date_breaks and date_labels of scale_x_date the desired format is…
-
2 votes
1 answer
39 views
A: How to delete a column of the "list" class in R
The problem can be solved in base R or with the package dplyr, which is what the question seems to want. I will use this example data set: dados <- data.frame(a = 1:3, b = I(list(1,1:2,1:3)), c =…
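One possible base R version, as a sketch: the data frame exemplo below is made up for illustration (the example data in the excerpt are cut off), and the answer's own code may differ.

```r
# Hypothetical data frame with one list column (b)
exemplo <- data.frame(
  a = 1:3,
  b = I(list(1, 1:2, 1:3)),
  z = c("x", "y", "w")
)

# Flag the columns that are lists and keep all the others
e_lista <- vapply(exemplo, is.list, logical(1))
sem_lista <- exemplo[!e_lista]
sem_lista
```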
-
2 votes
2 answers
70 views
A: Error plotting with ggplot
One solution is to use the function tidyr::complete. df %>% complete(ano = 2007:2018, fill = list(n = 0)) %>% ggplot(aes(ano, n))+ geom_line()…
-
1 vote
2 answers
64 views
A: Export Summary() as data-frame
Here’s a fairly simple way. dados1_summary <- lapply(na.omit(dados1), summary) dados1_summary <- suppressWarnings(do.call(cbind, dados1_summary)) dados1_summary <-…
-
5 votes
2 answers
51 views
A: How to manipulate two data sets at the same time?
It is not possible to manipulate two variables at the same time, as the question asks. But one can manipulate one variable at a time in a simpler way. The best way is to define functions that…
-
3 votes
3 answers
95 views
A: Date sequence from a range in R
A solution in base R may be as follows. saida <- apply(base, 1, function(x) { x <- unname(x) cbind.data.frame( ID = x[1], DATA = seq(as.Date(x[2]), as.Date(x[3]), by = "1 day") ) }) saida…
-
5 votes
1 answer
56 views
A: Error when colorizing bar graph and caption with the pal.bands function
First, load the necessary packages and read the data, but this time I will read them with read.csv2, since it already has the defaults header = TRUE, dec = "," and sep = ";". library(RColorBrewer)…
-
5 votes
2 answers
47 views
A: Increase the number of columns in the histogram
The number of bars in a histogram drawn by the base R function hist is set by the argument breaks, but care is needed, since with the question data the two instructions below give the same…
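The reason care is needed is that a single number passed to breaks is only a suggestion: hist hands it to pretty(), which picks "round" break points. A sketch with made-up data (the question's data are not shown) of one way to force the number of bars, by passing the break points themselves:

```r
set.seed(1)
x <- rnorm(100)   # hypothetical data

# A single number is a suggestion only; the number of bars may differ
h1 <- hist(x, breaks = 20, plot = FALSE)
length(h1$counts)

# A vector of k + 1 break points gives exactly k bars.
# floor()/ceiling() make sure the breaks span the whole range of x.
k <- 20
brks <- seq(floor(min(x)), ceiling(max(x)), length.out = k + 1)
h2 <- hist(x, breaks = brks, plot = FALSE)
length(h2$counts)
```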
-
6 votes
2 answers
50 views
A: How to relate a column to a dictionary in R?
After all the work you’ve had, the easiest way should be with merge, not forgetting that the columns to be matched have different names. merge(df2, corpo_programa, by.x = "autores", by.y = "nome") #…
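A small sketch of the same merge call on made-up data; the contents of df2 and corpo_programa are assumptions, only the column names autores and nome come from the excerpt.

```r
# Hypothetical data: the key is called "autores" in one table
# and "nome" in the other
df2 <- data.frame(autores = c("Ana", "Rui", "Eva"), n = c(3, 5, 2))
corpo_programa <- data.frame(nome = c("Rui", "Ana"),
                             programa = c("P1", "P2"))

# by.x / by.y name the key column of each data frame
res <- merge(df2, corpo_programa, by.x = "autores", by.y = "nome")
res
```

By default merge keeps only the rows that match in both tables, and the key column takes its name from the first data frame.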
-
1 vote
1 answer
60 views
A: Apply a function varying between categories in R
A function is first defined to calculate elasticity through the linear model. To calculate differences, each vector must have at least two elements; if it does not, the result is undetermined…
-
1 vote
3 answers
71 views
A: Conditional column based on multiple dplyr lines
Although the question asks for a dplyr solution, here is a base R one-liner with the function ave. df$output <- ave(df$car1, df$id, FUN = function(x) if(all(x == x[1])) x else "nd")…
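To show what the one-liner does, here it is applied to a made-up data frame (the values are an assumption; only the column names id, car1 and output come from the excerpt). Groups whose values of car1 are all equal keep them, the others get "nd".

```r
# Hypothetical data: id 2 has two different values of car1
df <- data.frame(
  id   = c(1, 1, 2, 2, 3),
  car1 = c("a", "a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

df$output <- ave(df$car1, df$id,
                 FUN = function(x) if (all(x == x[1])) x else "nd")
df
```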
-
4 votes
1 answer
88 views
A: Graph of trends in ggplot2
This type of problem usually has to do with reshaping the data. The format should be long and the data are in wide format. See this post on the English SO on how to reshape data from wide format to long…
-
6 votes
1 answer
79 views
A: Calculate percentage with dplyr::add_count
After the code with count, make a join to add the remaining columns. dt %>% count(Sex, Pclass, sort = TRUE) %>% mutate(perc = n/sum(n)*100) %>% inner_join(dt, .)…
-
3 votes
4 answers
150 views
A: Calculate percentage of an item in a group per year in R
Here is a relatively simple solution. library(dplyr) variaveis <- c("Carvão mineral", "Minerais não-metálicos", "Minério de ferro", "Minerais metálicos não-ferrosos") dados %>% filter(Item…
-
-1 votes
2 answers
170 views
A: How to put lines on a bar chart?
The main problem seems to be that the vector Valores is of class "character": the numbers are strings, without any numerical value. This can be seen on the y axis, whose annotations are…
-
1 vote
2 answers
24 views
A: How do I return to "double" from the original df?
First see which columns are of class "factor". These will be the columns to transform. str(df) #'data.frame': 6 obs. of 5 variables: # $ fixed.acidity : Factor w/ 3 levels "7.4","7.8","11.2": 1 2 2 3 1…
-
6 votes
1 answer
42 views
A: Warning message when joining year, month and day with the ymd() function of lubridate
Just replace paste0 with paste and the argument sep = "-". library(hflights) library(dplyr) library(lubridate) hflights %>% mutate(dt = ymd(paste(Year, Month, DayofMonth, sep = "-"))) %>%…
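Why the separator matters can be seen with a single date. The sketch below uses base R's as.Date in place of lubridate's ymd, an assumption made only to keep the example free of package dependencies; the values of ano, mes and dia are made up.

```r
ano <- 2011
mes <- 1
dia <- 12

# Without a separator the pieces run together and the string is
# ambiguous: it could be read as 2011-1-12 or as 2011-11-2
junto <- paste0(ano, mes, dia)
junto

# With sep = "-" each component is delimited
txt <- paste(ano, mes, dia, sep = "-")
d <- as.Date(txt)
d
```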
-
3 votes
2 answers
190 views
A: Label data in column charts in "dodge" position in R
The problem can be solved by putting aes(Ano, n) in the initial call to ggplot. This simplifies the geom_* calls since they share the axis values. Set the grouping variable for the bars at the beginning as well.…
-
3 votes
1 answer
68 views
A: How to store the results of a function in a data.frame?
I believe the following answers the question. Instead of a for loop, use an lapply loop over the names of the response variables. The hardest part is then having column names with special characters.…
-
3 votes
2 answers
67 views
A: How to group by text [R]
With a regex it can be done in one line of code. df$coluna2 <- sub(".*\\b([^[:space:]]+$)", "\\1", df$coluna1) Explanation of the regex. The capture group ([^[:space:]]+$) negates (^) the class space and…
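The effect of the regex is easier to see on a couple of made-up strings (the contents of coluna1 are an assumption): everything up to the last word boundary is dropped and only the final run of non-space characters is kept.

```r
# Hypothetical data
df <- data.frame(coluna1 = c("Rua das Flores", "Lisboa"),
                 stringsAsFactors = FALSE)

# ".*\\b" consumes the text up to a word boundary;
# "([^[:space:]]+$)" captures the non-space characters at the end;
# "\\1" replaces the whole match with the captured group
df$coluna2 <- sub(".*\\b([^[:space:]]+$)", "\\1", df$coluna1)
df
```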
-
0 votes
2 answers
38 views
A: Reorder Columns in graph made by ggplot
One way is simply to convert to factor after reshaping to long format. In the following code I will use tidyr::pivot_longer and do everything in the same pipe. library(dplyr) library(tidyr)…
-
-1 votes
2 answers
104 views
A: How to identify the page number of a .pdf by something written on it?
I believe that the following function divides the input pdf into pages, storing each page in a file and renaming those files with the names of the csv file. The function input is file - the name of…
-
3 votes
2 answers
57 views
A: Is there a "dd-mmm-yyyy" format (e.g., "13-feb-1980") in R?
Data from the answer by user @neves. In base R, see the formats here. format(as.Date(df$dt), "%d-%b-%Y") #[1] "13-fev-1980" "03-ago-1983" If the column is already of class "Date", see at the end of…
-
5 votes
1 answer
121 views
A: How to change the colors of geom_points in R
To color according to the months, you should use mes in aes(color = mes), not the date dt, as in the question's code. Note that although the gather function was the approach used in the past, the…
-
2 votes
1 answer
28 views
A: Why does R have multiple folders for Libraries?
Preliminary note: all translations by DeepL Translate and Google Translate, edited by me. The explanation can be found in R itself, but first see this answer by user Dirk Eddelbuettel, dated July 17, 2016, to a…
-
2 votes
2 answers
126 views
A: Geom_area with different filling colors
This solution is entirely based on this post by Yarnabrina of RStudio, which in turn is based, with credits, on an answer by @Henrik on the English Stack Overflow. The principle is very simple, finding the…
-
1 vote
1 answer
26 views
A: How to validate a data frame against an array of valid values?
No explicit loop is required; an *apply loop is simpler. result1 <- sapply(DF1, function(x) x %in% check) result1 <- as.data.frame(result1) result1 # V1 V2 V3 V4 #1 FALSE FALSE FALSE TRUE #2 TRUE…
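The same two instructions on a small made-up example; the contents of DF1 and check are assumptions, since the question's data are not shown.

```r
# Hypothetical data frame and vector of valid values
DF1 <- data.frame(V1 = c(1, 5, 9), V2 = c(7, 2, 5))
check <- c(2, 5)

# sapply() visits the columns one by one;
# %in% tests every element of a column against the valid values
result1 <- sapply(DF1, function(x) x %in% check)
result1 <- as.data.frame(result1)
result1
```

Each cell of result1 is TRUE when the corresponding cell of DF1 is one of the valid values.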
-
7 votes
1 answer
101 views
A: How to globally save the output of a function in R?
The principle to follow is this: in R, functions return the result of the last instruction. So to return a variable, just put it by itself as the last instruction. construcaoSudoku <- function(){…
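A minimal sketch of that principle. The body of the function is a placeholder (an empty 9 x 9 grid), since the real body is cut off in the excerpt; the variable names grade and sudoku are also assumptions.

```r
construcaoSudoku <- function() {
  # Placeholder body: an empty 9 x 9 grid
  grade <- matrix(0L, nrow = 9, ncol = 9)
  # The variable alone on the last line is the return value
  grade
}

# Assign the result to keep it outside the function
sudoku <- construcaoSudoku()
dim(sudoku)
```

Assigning the returned value in the calling environment is usually preferable to creating a global variable from inside the function.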