Compare columns of a dataframe with those of others and remove columns that are not common between them


Viewed 559 times

1

Suppose I have 3 dataframes, each with various columns (e.g. x1, x2, ..., xn). However, not all of these columns exist in every dataframe. My goal is to compare the dataframes and keep, in EACH OF THEM, only the columns they all have in common.

Is it possible to do this with only ONE function?

3 answers

4


The basic idea is shown below; all you have to do is automate the process so that it covers more dataframes.

df1 = data.frame(x1=runif(5,0,5), x2=runif(5,5,10), x3=runif(5,0,5), x4=runif(5,10,15))
df2 = data.frame(x1=runif(5,0,5), x2=runif(5,5,10), x4=runif(5,10,15))
df3 = data.frame(x2=runif(5,0,5), x3=runif(5,5,10), x4=runif(5,10,15))

idem_cols <- intersect(intersect(colnames(df1), colnames(df2)), colnames(df3))

> df1[idem_cols]
#        x2       x4
#1 6.393069 12.99105
#2 7.016564 12.57616
#3 9.451348 11.62159
#4 5.728012 11.23728
#5 8.795608 13.79248

> df2[idem_cols]
#        x2       x4
#1 9.489572 12.21699
#2 7.423554 11.57359
#3 5.058671 10.75123
#4 9.319093 10.00097
#5 5.620968 14.91703

> df3[idem_cols]
#         x2       x4
#1 2.5554488 13.83610
#2 4.4639556 10.05555
#3 4.1599600 14.10665
#4 0.4610773 10.21153
#5 2.9923365 14.80820
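
To automate this for any number of dataframes, one possible sketch is to wrap the repeated intersect() calls in a small helper. The name keep_common_cols is hypothetical (it is not a base R function), and the example uses small fixed values instead of runif() so the result is reproducible.

```r
# Hypothetical helper: keeps only the columns shared by every
# dataframe in the list `dfs`.
keep_common_cols <- function(dfs) {
  # Start from the first dataframe's columns and narrow them down
  common <- colnames(dfs[[1]])
  for (d in dfs[-1]) {
    common <- intersect(common, colnames(d))
  }
  # Subset every dataframe to the shared columns
  lapply(dfs, function(d) d[common])
}

df1 <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9, x4 = 10:12)
df2 <- data.frame(x1 = 1:3, x2 = 4:6, x4 = 10:12)
df3 <- data.frame(x2 = 1:3, x3 = 4:6, x4 = 7:9)

trimmed <- keep_common_cols(list(df1 = df1, df2 = df2, df3 = df3))
names(trimmed$df1)  # "x2" "x4"
```

The helper returns a named list, so each trimmed dataframe is available as trimmed$df1, trimmed$df2 and so on.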

2

Here is an add-on to @Fernandes's answer:

list<-list(df1[idem_cols],df2[idem_cols],df3[idem_cols])
list # creates a list with the common columns of the dataframes

> list
[[1]]
    x2       x4
1 7.796689 14.54941
2 9.473103 14.15803
3 7.818807 10.96527
4 6.381239 14.44439
5 9.552761 12.73286

[[2]]
    x2       x4
1 5.755445 11.08562
2 8.305431 11.57553
3 7.006299 12.62098
4 7.949986 13.11914
5 6.095582 10.30344

[[3]]
     x2       x4
1 0.6701076 14.23146
2 4.5605675 11.67825
3 0.8683714 11.08652
4 2.9171325 10.14618
5 3.8379593 14.99512

Afterwards, a for loop assigns a specific name to each dataframe:

for (i in seq_along(list)) {
    assign(paste0('df', i), value = data.frame(list[[i]]))
}

As a result, df1, df2 and df3 each contain only the common columns, which is useful for applying functions (such as tapply) to multiple dataframes.
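
As a rough illustration of that last point, keeping the trimmed dataframes in a list makes it easy to apply the same function to all of them. The sketch below uses colMeans as a stand-in for whatever summary is needed, and the data are made-up fixed values, not the ones from the answer above.

```r
# Made-up dataframes that already share the same columns
dfs <- list(
  df1 = data.frame(x2 = c(1, 2, 3), x4 = c(10, 20, 30)),
  df2 = data.frame(x2 = c(4, 5, 6), x4 = c(40, 50, 60))
)

# One column of means per dataframe, one row per shared column
col_means <- sapply(dfs, colMeans)
col_means["x2", "df1"]  # mean of x2 in df1
```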

0

Another option would be:

result <- Reduce(intersect, list(colnames(df1), colnames(df2), colnames(df3), colnames(df4), colnames(df5)))

Here df4 and df5 stand for additional dataframes; you can compare as many dataframes as you want this way.
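
To see what Reduce is doing, here is a small sketch using the column names of the three example dataframes from the first answer, written out as plain character vectors. With accumulate = TRUE, Reduce also returns the intermediate intersections.

```r
cols <- list(
  c("x1", "x2", "x3", "x4"),  # colnames(df1)
  c("x1", "x2", "x4"),        # colnames(df2)
  c("x2", "x3", "x4")         # colnames(df3)
)

# Final result: columns present in every vector
common <- Reduce(intersect, cols)

# Intermediate results after each pairwise intersect()
steps <- Reduce(intersect, cols, accumulate = TRUE)
steps[[2]]  # "x1" "x2" "x4"
steps[[3]]  # "x2" "x4"
```

The resulting vector can then be used to subset each dataframe, e.g. df1[common].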
