How can I compare two columns of a spreadsheet and keep the values they have in common?

Hello!

I am working with a spreadsheet in Excel that has this structure:

Coluna_A    Coluna_B
A           A
B           B
C           C
C_1         E
D
E
F

What I want is a way to build a third column containing only the values that are present in both columns. So:

Coluna_C
A
B
C
E

Note that the sample unit "E" in "Coluna_B" is in the same row as "C_1" in "Coluna_A", but it must still appear in "Coluna_C" because it occurs in both columns. Does anyone know of code to automate this analysis, whether in Excel or in R?

1 answer

Maybe this is what you want.
Note that the elements common to both columns become the first elements of Coluna_C, regardless of their positions in the original vectors, Coluna_A or Coluna_B.

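# Start with a column filled with NA
dados$Coluna_C <- NA
# Values that occur in both columns
comuns <- intersect(dados$Coluna_A, dados$Coluna_B)
# Put them in the first rows of Coluna_C
dados$Coluna_C[seq_along(comuns)] <- comuns
dados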
#  Coluna_A Coluna_B Coluna_C
#1        A        A        A
#2        B        B        B
#3        C        C        C
#4      C_1        E        E
#5        D              <NA>
#6        E              <NA>
#7        F              <NA>
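
To make the behaviour of intersect() a bit more concrete, here is a small sketch with made-up vectors (x and y are hypothetical and not part of the question's data). It illustrates that each common value is returned only once, in the order in which it first appears in the first argument.

```r
# Hypothetical vectors, for illustration only
x <- c("E", "A", "A", "C_1", "B")
y <- c("B", "A", "E", "E")

# Each common value appears once, ordered as in x
comuns <- intersect(x, y)
comuns
# [1] "E" "A" "B"
```

So if Coluna_A had repeated values, Coluna_C would still list each common value a single time.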

If you don't want NA values, start with

dados$Coluna_C <- ""

Data.

dados <- read.table(text = "
Coluna_A    Coluna_B
A           A
B           B
C           C
C_1         E
D
E
F
", header = TRUE, fill = TRUE, stringsAsFactors = FALSE)
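
Putting the pieces together, this is a minimal end-to-end sketch using the blank-string variant. One detail is my own assumption and not part of the answer above: because fill = TRUE leaves the short rows of Coluna_B as "", the sketch removes "" from the common values with setdiff(), in case both columns ever contain blank cells.

```r
dados <- read.table(text = "
Coluna_A    Coluna_B
A           A
B           B
C           C
C_1         E
D
E
F
", header = TRUE, fill = TRUE, stringsAsFactors = FALSE)

# Values present in both columns; drop "" so blank cells never count as a match
# (this setdiff() step is an added precaution, not in the original answer)
comuns <- setdiff(intersect(dados$Coluna_A, dados$Coluna_B), "")

# Start with blanks, then fill the first rows with the common values
dados$Coluna_C <- ""
dados$Coluna_C[seq_along(comuns)] <- comuns
dados
```

With the sample data the setdiff() step changes nothing, since Coluna_A has no blank cells.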
