How to exclude specific rows from a data frame in R




I'm validating a dataset in R and I've identified some rows that don't make sense, so I'll have to exclude them. What is the best way to delete the rows containing 'N', 'P' and 'K' in the base$fraude variable in the example below? I tried this function and it didn't work: subset(renamed base_fraud = "K")

Exclude rows N, P, S

3 answers


base_renomeada <- base_renomeada[!base_renomeada$fraude %in% c("N", "P", "K"), ]

Or, if you would rather specify the values you want to keep:

base_renomeada <- base_renomeada[base_renomeada$fraude %in% c("S", "M"), ]
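As a rough, self-contained sketch of the two indexing forms, assuming a small made-up data frame with a fraude column (the sample values and the names kept_by_exclusion and kept_by_inclusion are illustrative, not from the question's real data):

```r
# Hypothetical sample data; only the column name fraude comes from the question
base_renomeada <- data.frame(
  fraude  = c("K", "M", "N", "P", "S"),
  valores = c(1, 2, 18405914, 1, 111044)
)

# Keep rows whose fraude value is NOT in the unwanted set
kept_by_exclusion <- base_renomeada[!base_renomeada$fraude %in% c("N", "P", "K"), ]

# Keep rows whose fraude value IS in the wanted set
kept_by_inclusion <- base_renomeada[base_renomeada$fraude %in% c("S", "M"), ]

print(kept_by_exclusion)
print(kept_by_inclusion)
```

The two forms only agree when the wanted and unwanted sets together cover every value in the column; if other codes can appear, pick the form that matches your intent.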


With the tidyverse you can do it like this:

df <- data.frame(
  fraude = c('K', 'M', 'N', 'P', 'S'),
  valores = c(1, 2, 18405914, 1, 111044)
)

  fraude  valores
1      K        1
2      M        2
3      N 18405914
4      P        1
5      S   111044


library(dplyr)  # provides %>% and filter()

df %>%
  filter(!fraude %in% c('N', 'P', 'K'))

  fraude valores
1      M       2
2      S  111044


With subset, as attempted in the question, it would be:

subset(base, !fraude %in% c("N", "P", "K"))

In terms of performance, it is more efficient to build a logical index and subset with it:

i <- !base$fraude %in% c("N", "P", "K")
result <- base[i, ]
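A minimal sketch of how the logical index can be reused while validating, assuming a made-up data frame named base with a fraude column (the sample values and the names dropped and from_subset are illustrative assumptions):

```r
# Hypothetical sample data; only the names base and fraude come from the thread
base <- data.frame(
  fraude  = c("K", "M", "N", "P", "S"),
  valores = c(1, 2, 18405914, 1, 111044)
)

# TRUE for rows to keep, FALSE for rows to drop
i <- !base$fraude %in% c("N", "P", "K")

result  <- base[i, ]   # rows kept
dropped <- base[!i, ]  # rows removed, useful for auditing the exclusion

n_kept    <- sum(i)    # number of rows kept
n_dropped <- sum(!i)   # number of rows removed

# Cross-check against the subset() form of the same filter
from_subset <- subset(base, !fraude %in% c("N", "P", "K"))
all.equal(result, from_subset)
```

Because %in% returns only TRUE or FALSE, rows where fraude is NA are kept by this index rather than turning into all-NA rows.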
