How to exclude specific rows from a data frame in R

2

I'm validating a dataset in R and have identified some rows that don't make sense, so I'll have to exclude some of them. What is the best way to delete the rows containing 'N', 'P' and 'K' in the base$fraude variable in the example below? I tried this function and it didn't work: subset(base_renomeada$fraude = "K")

Exclude rows N, P, S

3 answers

2

base_renomeada <- base_renomeada[!base_renomeada$fraude %in% c("N", "P", "K"), ]

Or, if you would rather list the values you want to keep:

base_renomeada <- base_renomeada[base_renomeada$fraude %in% c("S", "M"), ]
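
To make the two forms easy to try, here is a minimal sketch on toy data. It assumes the column is called fraude and borrows its values from the example in another answer; the names sem_npk and so_sm are only illustrative. The %in% test is applied to the fraude column, because it is that column's values that are being compared.

```r
# Toy data (assumed; not the asker's real dataset)
base_renomeada <- data.frame(
  fraude  = c("K", "M", "N", "P", "S"),
  valores = c(1, 2, 18405914, 1, 111044)
)

# Drop the rows whose fraude value is N, P or K
sem_npk <- base_renomeada[!base_renomeada$fraude %in% c("N", "P", "K"), ]

# Keep only the rows whose fraude value is S or M
so_sm <- base_renomeada[base_renomeada$fraude %in% c("S", "M"), ]

print(sem_npk)
print(so_sm)
```

On this toy data both forms return the same two rows. They differ when the column holds other values or NA: %in% returns FALSE for NA, so the negated form keeps NA rows while the keep-list form drops them.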

2


With the tidyverse you can do it like this:

df <- data.frame(
  fraude = c('K', 'M', 'N', 'P', 'S'), 
  valores = c(1, 2, 18405914, 1, 111044)
)

  fraude  valores
1      K        1
2      M        2
3      N 18405914
4      P        1
5      S   111044

library(tidyverse)

df %>% 
  filter(! fraude %in% c('N', 'P', 'K'))

  fraude valores
1      M       2
2      S  111044
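
A small variation, as a sketch: filter() does not modify df in place, so the result has to be assigned if you want to keep it. This version also loads only dplyr, which provides both filter() and %>%. The name df_limpo is just a hypothetical choice.

```r
library(dplyr)  # filter() and %>% are both available after loading dplyr

df <- data.frame(
  fraude  = c("K", "M", "N", "P", "S"),
  valores = c(1, 2, 18405914, 1, 111044)
)

# filter() returns a new data frame; assign it to keep the filtered rows
df_limpo <- df %>%
  filter(!fraude %in% c("N", "P", "K"))

print(df_limpo)
```

The original df still has all five rows afterwards; only df_limpo holds the filtered version.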

2

Using subset, as in the question, it would be:

subset(base, !fraude %in% c("N", "P", "K"))
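
As for why the attempt in the question fails: subset() expects the data frame as its first argument and a logical condition as its second, and a single = inside a call does not compare anything (comparison is == or !=). The sketch below uses toy data (an assumption, not the asker's dataset) and also shows why %in% is the right tool for several values, since == recycles the right-hand vector position by position instead of testing membership.

```r
# Toy data (assumed for illustration)
base <- data.frame(
  fraude  = c("K", "N", "P", "M", "S", "K"),
  valores = 1:6
)

# One value: compare with != (or ==), never a single =
sem_k <- subset(base, fraude != "K")

# Several values: == compares position by position with recycling,
# so it does NOT answer "is this value one of N, P, K?"
n_igual <- sum(base$fraude == c("N", "P", "K"))
n_in    <- sum(base$fraude %in% c("N", "P", "K"))

# Membership test, negated to drop the matching rows
sem_npk <- subset(base, !fraude %in% c("N", "P", "K"))

print(c(n_igual = n_igual, n_in = n_in))
print(sem_npk)
```

On this toy data the == comparison flags a single row, while %in% flags all four rows that actually hold N, P or K.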

In terms of performance, it is more efficient to use a logical index:

i <- !base$fraude %in% c("N", "P", "K")
result <- base[i, ]
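
If you want to check this on your own machine, a rough sketch follows. The data are simulated (an assumption, not the asker's dataset), and the timings depend on the machine and on the size of the data, so treat them as indicative only.

```r
set.seed(1)  # reproducible simulated data
n <- 1e5
base <- data.frame(
  fraude  = sample(c("K", "M", "N", "P", "S"), n, replace = TRUE),
  valores = runif(n)
)

# Approach 1: subset()
res_subset <- subset(base, !fraude %in% c("N", "P", "K"))

# Approach 2: logical index
i <- !base$fraude %in% c("N", "P", "K")
res_index <- base[i, ]

# Check that both approaches select the same rows
print(all.equal(res_subset, res_index))

# Rough timings over a few repetitions
print(system.time(for (k in 1:10) subset(base, !fraude %in% c("N", "P", "K"))))
print(system.time(for (k in 1:10) base[!base$fraude %in% c("N", "P", "K"), ]))
```

For a one-off filter on a small data frame the difference is unlikely to matter, so readability can reasonably decide between the two.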
