Guys, I have the following df:
df <- data.frame(X =c("a","b","c","a","b","c","a","b","c","d","a","b","c","d","e"),
Y = c("w","w","w", "K","K","K", "L","L","L","L","Z","Z","Z","Z","Z"))
Note that the first vector has 5 levels and the second has 4 levels. My goal is to select the rows of the df that have all levels of vector 1 in common with vector 2. That is, I want to select the rows that have levels "a", "b" and "c", since "d" appears only twice and "e" appears only once in vector 1.
I tried to make a list of the levels in common and keep only the rows with those levels using subset. However, it doesn't work, because this list of levels doesn't give the positions of the rows I want to remove. Example:
comuns <- c("a","b","c")
df2 <- df[c(comuns),]
In my real df there are 64 levels in common, so doing it by hand is not feasible. Can someone help me?
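To illustrate why the attempt above does not return the wanted rows, here is a minimal sketch using the same example data. The name `failed` is hypothetical and only used for illustration. A character vector inside `[ , ]` is matched against the row names of the data frame, not against the values of column X, so a logical test such as `%in%` on the column is one way to get the rows instead.

```r
df <- data.frame(
  X = c("a","b","c","a","b","c","a","b","c","d","a","b","c","d","e"),
  Y = c("w","w","w","K","K","K","L","L","L","L","Z","Z","Z","Z","Z")
)
comuns <- c("a", "b", "c")

# Character indices are looked up in the row names ("1", "2", ...),
# not in the values of X, so this yields three rows filled with NA.
failed <- df[comuns, ]

# A logical condition on the column keeps the rows whose X value
# is one of the levels listed in `comuns`.
df2 <- df[df$X %in% comuns, ]
```

With the example data, `df2` keeps only the rows where X is "a", "b" or "c", and drops the rows with "d" and "e".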
I couldn't understand what the phrase "select the rows of the df that have all levels of vector 1 in common with vector 2" means. In particular, I don't see how this phrase turned into the following one: "select the rows that have levels 'a', 'b' and 'c'". Is vector 1 column X? Is vector 2 column Y? If so, I cannot understand how X and Y can have levels in common in this specific example. It would be helpful to edit the question and include the expected output. – Marcus Nunes
Yes, Marcus. Vector 1 is the X column and vector 2 is the Y column. I've already solved the problem with the help of colleagues below. Thank you! – Antonio Carlos Porto
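One possible way to build the list of common levels automatically, rather than typing all 64 by hand, is sketched below. It assumes that "in common" means an X level that occurs within every level of Y, which matches the expected "a", "b" and "c" in the example. The names `by_group` and `common` are hypothetical, and this is not necessarily the approach the asker ended up using.

```r
df <- data.frame(
  X = c("a","b","c","a","b","c","a","b","c","d","a","b","c","d","e"),
  Y = c("w","w","w","K","K","K","L","L","L","L","Z","Z","Z","Z","Z")
)

# Split the X values into one character vector per level of Y.
by_group <- split(as.character(df$X), df$Y)

# Intersect the groups pairwise to keep only the X levels
# that are present in every Y group.
common <- Reduce(intersect, by_group)

# Keep the rows whose X value is one of those common levels.
df2 <- df[df$X %in% common, ]
```

Because `common` is computed from the data, the same three lines should work unchanged when there are 64 shared levels instead of 3.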