Most voted "dplyr" questions
dplyr is an R package dedicated to data manipulation.
124 questions
-
22 votes, 5 answers, 12797 views
How to consolidate (aggregate or group) the values in a database?
Suppose I have the following database vendas<-c(100,140,200,300,20,1000,200,3000) vendedor<-c("A","B","A","B","C","C","D","A")…
-
13 votes, 4 answers, 1143 views
Add rownames as a column using dplyr
I would like to do something that is quite simple in base R syntax, but using the dplyr package. The task is basically to add the row.names of a data.frame object as a column in that same…
-
9 votes, 3 answers, 1163 views
How to split a string in a given row of a data.frame and create more rows at the same time?
I have a data.frame with a column with strings like "123-235-203". For example, the row: string column1 column2 123-235-203 x y I want to separate this string so that the row that contains it…
-
8 votes, 1 answer, 748 views
Filter in dplyr using a categorical variable
Suppose I have the following data set: set.seed(12) dados <- data.frame(grupos=rep(letters[1:5], 5), valores=rnorm(25)) head(dados) grupos valores 1 a -1.8323176 2 b -0.0560389 3 c 0.6692396 4 d…
-
8 votes, 1 answer, 1670 views
Sorted bar chart using dplyr and ggplot2
I would like to create a bar chart after counting the number of occurrences of the categories of a data set. Suppose my dataset is this below: dados <- structure(list(categorias = structure(c(5L,…
-
7 votes, 1 answer, 143 views
Building new variables using dplyr
I have the following database Clientes.Dep..Gratuito.PCG Clientes.Dep..Gratuito Clientes.Dep..Não.Gratuito 0 0 0 0 0 0 25 0 0 0 0 2 0 0 79 0 0 71 Clientes.Usu..Gratuito.PCG Clientes.Usu..Gratuito…
-
7 votes, 2 answers, 292 views
Cumulative count of group occurrences on dates
I have a data set similar to the one below. It has a column with dates and another with occurrences of groups on these dates. data grupo 1 2019-01-01 a 2 2019-01-01 a 3 2019-01-01 a 4 2019-01-01 a 5…
-
5 votes, 1 answer, 259 views
Efficiently select the first rows by group
Suppose I have the following database set.seed(100) base <- expand.grid(grupo = c("a", "b", "c", "d"), score = runif(100)) And that I want to select the rows with the smallest score depending on the…
-
5 votes, 2 answers, 256 views
Sort the k highest results using dplyr
I can select the k greatest results from a table in R. For example, if k equals 5, I get the following result: library(dplyr) library(ggplot2) top_n(mpg, 5, wt=displ) # A tibble: 5 × 11 manufacturer…
-
5 votes, 1 answer, 85 views
Code improvement
Good afternoon. I have the following data structure: structure(list(CIDADE = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label =…
-
5 votes, 2 answers, 899 views
Renaming the levels of a factor based on a data frame
Suppose I have the data frame iris, available in R's memory: head(iris) Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2 setosa 3 4.7 3.2 1.3…
-
5 votes, 1 answer, 109 views
-
5 votes, 1 answer, 45 views
Randomizing two sets of numbers, not repeating the values within each group (R)
Suppose I have these individuals in a file: ID 1 1 1 3 3 3 7 7 7 And I need to assign two sets of numbers to ID randomly (set1 - 1,2,3; set2 - 5,15,25). To do this my attempt was: df %>%…
-
4 votes, 2 answers, 823 views
dplyr and gsub: how to replace parts of one column with another
I have the following data frame: xis <- data.frame(x1=c("**alo.123", "**alo.132", "**alo.199"), x2=c("sp", "mg", "rj"), x3=c(NA)) I would like to create a new column, using gsub as follows: x3[1]…
-
4 votes, 2 answers, 1551 views
How to group data by an id in R
I have the following database: id x 1 2 1 3 2 3 3 3 3 3 3 3 I want to create a new dataset without repeated values in the id field; to do this I can average the values of the field x that belong to the…
-
4 votes, 2 answers, 175 views
Apply a function to groups of data
I need to separate the data into groups and perform the calculations in two or three groups/dimensions. I found the tapply function, which solves the problem. With it I get what I need by using the…
-
4 votes, 1 answer, 650 views
Interaction plot in ggplot2
I’m trying to adapt some standard R graphics to the style of ggplot2. One of the plots for which I intend to do this is the interaction plot in a linear model fitting study. The following data…
-
4 votes, 2 answers, 133 views
Direct (and beautiful) solution to fix a dataset using dplyr
I have the following dataset of defaulters: df <- data.frame( lead_15 = c(1,0,0,0,0,1,0,0,1,0,0,0,0,0,1), lead_30 = c(0,0,0,1,0,0,1,1,0,1,0,0,0,1,0), lead_60 = c(0,1,0,0,1,0,0,0,0,0,1,1,0,0,0),…
-
4 votes, 1 answer, 59 views
Reorganize data frame in a list using dplyr in R
This is my dataframe: dput(df) structure(list(ind = structure(c(16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437, 16437,…
-
4 votes, 2 answers, 4518 views
Select multiple rows of a data.frame by the highest values in R
I have the following data.frame in R: df <- data.frame(x = c(10,10,2,3,4,8,8,8), y = c(5,4,6,7,8,3,2,4)) df x y 1 10 5 2 10 4 3 2 6 4 3 7 5 4 8 6 8 3 7 8 2 8 8 4 First point: I would like to get…
-
4 votes, 1 answer, 139 views
Perform mutate on several columns simultaneously
Hello, I have a dataframe where I want to apply the same function to several columns at the same time. I tried to use dplyr::mutate_at but I don’t think I understand the logic of this operation.…
-
4 votes, 3 answers, 4047 views
A: How to count and sum the amount of a certain "factor" in the observations (rows) of a data.frame?
Hello, I would like to get the number of "yes" values (factor) in each row of a data.frame, as follows. Would anyone know what arguments I would have to use to do this with "mutate"? I tried several ways and…
-
4 votes, 1 answer, 95 views
How to use dplyr within a function?
Let’s say I wanted to create a function that internally uses some functions of dplyr or any package from tidyverse using this kind of syntax. For illustration purposes: exemplo <- function(df,…
-
4 votes, 1 answer, 93 views
Including columns in a dataframe in R using a rule
This is my dataframe: df<-as.data.frame(matrix(rexp(200, rate=.1), ncol=10)) colnames(df)<-c("one","two","three","four","five","six","seven","eight","nine","ten") This is the entry I will use…
-
4 votes, 1 answer, 136 views
Creating a dataframe based on two other dataframes using dplyr in R
These are my dataframes: df<- as.data.frame(matrix(rexp(200),, 25)) colnames(df)<-c("A","B","C","D","E","F","G","H","I","J", "K","L","M","N","O","P","Q","R","S","T", "U","V","X","Z","W")…
-
4 votes, 1 answer, 1149 views
How to calculate percentage change with 3 variables in R
I have the following data: library(sidrar) Tab1612SojaQde <-get_sidra(1612,variable = 214, period = c("last"=22), geo="State",classific = 'c81', category = list(2713)) head(Tab1612SojaQde)…
-
4 votes, 1 answer, 90 views
Merge two .csv spreadsheets in R
I’m working with data from the transparency portal and need to join two datasets, prof1.csv and prof2.csv. The final result of the merge, which I named prof.csv, has duplicated rows due to columns 18 of gross…
-
4 votes, 3 answers, 71 views
Conditional column based on multiple rows with dplyr
I have this df: structure(list(id = c("R054", "R054", "R054", "R054", "R054", "GT68U", "GT68U", "GT68U", "GT68U", "GT68U", "G001", "G001", "G001", "G001"), car1 = c("sim", "sim", "sim", "sim",…
-
4 votes, 1 answer, 72 views
How to transpose from "wide" to "long" (wide to long) with several variables?
I have a dataframe with multiple variables, as in the example below: df <- read.table(header=TRUE, text=" ID COR TIPO SITUACAO_2016 SITUACAO_2017 SITUACAO_2018 SITUACAO_2019 SITUACAO_2020…
-
3 votes, 2 answers, 170 views
In R, using dplyr, create a new matrix
Suppose I have the following database >data zona candidato votos 1 A 100 1 B 20 2 A 30 2 B 15 I want, using dplyr, the following matrix >nova zona votos_zona votosA votosB 1 120 100 20 2 45 30…
-
3 votes, 1 answer, 1331 views
How to join two data.frames in R with different variables and out of order?
I have two data frames: frame1 <- data.frame(dia=c("02/01/2017","03/01/2017","04/01/2017","05/01/2017"), y=c(2,2,1,2),w=c(4,4,2,2),z=c(25,16,24,30), k=c("sim","nao","sim","nao")) frame2 <-…
-
3 votes, 2 answers, 94 views
Different Records
I have two tables: MATRICULA_A <-c(123,234,345,456) dados_1 <- data.frame(MATRICULA_A) MATRICULA_A <-c(345,456,111,222,333,444) dados_2 <- data.frame(MATRICULA_A) I need to extract only…
-
3 votes, 1 answer, 64 views
Run a .GlobalEnv function in parallel processing
I need to execute a function that is in .GlobalEnv in parallel using the multidplyr package. Using a simple example without parallel processing, it works as expected: library(dplyr)…
-
3 votes, 2 answers, 80 views
Indicator in R conditioned on a variable with duplicate values
Suppose there is a dataset with two variables as follows: Município IF RIOBOM Cooperativa RIOBOM Cooperativa ABADIA Múltiplo ABADIA Múltiplo ABADIA Cooperativa ABADIA Banco DOURADOS Banco DOURADOS…
-
3 votes, 1 answer, 587 views
How to group information into a data frame from missing data?
I need to exclude empty rows from the df of a 30-year time series, with three daily measurements for each variable. I have already used the function subset(x, ...), which solves part of the problem.…
-
3 votes, 1 answer, 611 views
A - How to create a lagged variable (lag) conditioned on the individual?
I need to lag a variable from my dataset (dCoopCred). However, the lag must not mix two individuals (CNPJ). I would like LAG_Result_ant_desp to be Result_ant_desp at t-1 (previous period).…
-
3 votes, 1 answer, 47 views
Problem organizing a tidyr dataframe in R
I have this dataframe and I need to organize it so that the single dates are the first column and the columns are the shares of Bovespa with their values being their respective prices:…
-
3 votes, 1 answer, 369 views
How to run str_detect (stringr) for more than one variable and at once?
I want to filter my database based on two variables: via and city. This filter, however, is made by means of particles of cases present in these two variables. For example, I want to analyze who…
-
3 votes, 1 answer, 387 views
Subtract rows from a group in a data frame by another group
Assuming the following example: set.seed(1234) df=data.frame(grupo=rep(c("A1","A1.2","C","D"), 3), ano=c(rep(2007,4),rep(2008,4),rep(2009,4)), valor1=sample.int(100,12), valor2=sample(20,12),…
-
3 votes, 1 answer, 545 views
Create new column from existing column in R
I have a database in which the first column has the code of the disciplines of my institution and the second column has the name of the respective discipline. I want to create a third column where…
-
3 votes, 1 answer, 42 views
Create new rows and columns for non-existent values
I have a data frame where the "years" column is not filled in for all the data I need. I would need, for each observation, the 1988 to 2014 range, filling with 0 (zero) the years whose values do not…
-
3 votes, 2 answers, 74 views
Select dataframe information based on specific conditions
I am working with election data for mayors in Brazil and would like to select in my database only data related to municipalities based on two conditions: Municipalities where the dispute took place…
-
3 votes, 1 answer, 167 views
How to partially disregard NA in R operations with a historical data series?
I have a set of rain data measured every hour and I need to sum this data over each day. For this, I am using the commands below: library(dplyr) df %>% group_by(data) %>%…
-
3 votes, 2 answers, 56 views
Apply a function to a dataframe in R
I created a function to automatically check if the value of a column is contained in a list. I could do dplyr::mutate + dplyr::ifelse, but as they are for many columns, it would be a very long code.…
-
3 votes, 1 answer, 45 views
Automatically creating new variables through interaction between two pre-existing variables
Suppose I have the following dataset, dados: dados #> letras numeros cores valor #> 1 a 1 branco 2 #> 2 a 1 preto 1 #> 3 a 2 branco 9 #> 4 a 2 preto 4 #> 5 a 3 branco 8 #> 6 a 3…
-
2 votes, 1 answer, 3371 views
How to change the class of a variable within a table/data frame/tibble?
I have a table called tab01 with the following variables (columns) and their respective classes in parentheses: uf (Character), regiao (Character), ano (double) and pop (double). I want, inside the…
-
2 votes, 1 answer, 119 views
-
2 votes, 1 answer, 34 views
Operation with 2 columns dynamically
Consider the data frame v1 v2 v3 v4 v5 a x 2 1 4 b y 4 3 5 c z 6 5 6 Calculating, for example, P1=v3+v4 P2=v4+v5 is trivial, because I can do this for each one manually, so I would have: v1 v2 v3 v4 v5 p1…
-
2 votes, 1 answer, 134 views
In R, use dplyr functions to find the minimum distance
I have a matrix with two numerical variables: lat and long. Like this: > head(pontos_sub) id lat long 1 0 -22,91223 -43,18810 2 1 -22,91219 -43,18804 3 2 -22,91225 -43,18816 4 3 -22,89973…
-
2 votes, 1 answer, 284 views
Select ID vectors with certain characteristics in R
I have a data frame with four columns of values for each ID and need to create a new df excluding IDs whose vectors have more than one zero or more than one NA. I created the DF library(dplyr)…