Operations with Data Frame

Asked

Viewed 145 times

3

I need to do as the image below. What would it look like? I’ve already managed to isolate lines that are just zeros in all columns using the subset(Base, VALOR1==0&VALOR2==0&VALOR3==0&VALOR4==0), however I cannot do a treatment between the data frame that contains only zeros of the complete data frame.

inserir a descrição da imagem aqui

structure(list(CADASTRO = structure(c(8L, 1L, 6L, 9L, 7L, 3L, 
5L, 4L, 2L), .Label = c("ANTONIA", "ANTONIO", "FABIO", "FILOMENA", 
"MARCUS", "MARIA", "MONICA", "TEREZA", "THIAGO"), class = "factor"), 
    VALOR1 = c(0L, 10L, 0L, 0L, 0L, 20L, 45L, 0L, 0L), VALOR2 = c(0L, 
    0L, 0L, 0L, 0L, 30L, 32L, 0L, 23L), VALOR3 = c(0L, 20L, 0L, 
    0L, 0L, 45L, 12L, 0L, 21L), VALOR4 = c(0L, 45L, 0L, 0L, 0L, 
    90L, 22L, 0L, 100L), VALOR5 = structure(1:9, .Label = c("A", 
    "B", "C", "D", "E", "F", "G", "H", "I"), class = "factor"), 
    VALOR6 = structure(c(8L, 7L, 8L, 1L, 3L, 6L, 4L, 5L, 2L), .Label = c("", 
    "*", "1", "2.2", "3", "D", "NOK", "OK"), class = "factor")), class = "data.frame", row.names = c(NA, 
-9L))
> base
  CADASTRO VALOR1 VALOR2 VALOR3 VALOR4 VALOR5 VALOR6
1   TEREZA      0      0      0      0      A     OK
2  ANTONIA     10      0     20     45      B    NOK
3    MARIA      0      0      0      0      C     OK
4   THIAGO      0      0      0      0      D       
5   MONICA      0      0      0      0      E      1
6    FABIO     20     30     45     90      F      D
7   MARCUS     45     32     12     22      G    2.2
8 FILOMENA      0      0      0      0      H      3
9  ANTONIO      0     23     21    100      I      *

> library(dplyr)
> dados = base %>% select('CADASTRO','VALOR1','VALOR2','VALOR3','VALOR4')
> dados
  CADASTRO VALOR1 VALOR2 VALOR3 VALOR4
1   TEREZA      0      0      0      0
2  ANTONIA     10      0     20     45
3    MARIA      0      0      0      0
4   THIAGO      0      0      0      0
5   MONICA      0      0      0      0
6    FABIO     20     30     45     90
7   MARCUS     45     32     12     22
8 FILOMENA      0      0      0      0
9  ANTONIO      0     23     21    100

> base_0 = subset(dados, VALOR1==0&VALOR2==0&VALOR3==0&VALOR4==0)
> base_0
  CADASTRO VALOR1 VALOR2 VALOR3 VALOR4
1   TEREZA      0      0      0      0
3    MARIA      0      0      0      0
4   THIAGO      0      0      0      0
5   MONICA      0      0      0      0
8 FILOMENA      0      0      0      0
  • Hello Fabio, show how your code is so far so we can help you in a better way.

  • provide the data in the question pfv: https://pt.meta.stackoverflow.com/questions/6700/como-fazer-uma-questiona-reproduces%C3%Advel-em-r

  • Doubt best presented!

2 answers

4


You can use the same function subset() with the operator ! which means a denial. More information here.

dados = subset(dados, !(VALOR1==0&VALOR2==0&VALOR3==0&VALOR4==0))

0

Only to complement the above answer. We can do this dynamically (without having to name the columns). This can be useful if you have a large number of columns or if you are not sure of the nature of the columns. In your example, you are only interested in columns that have as a class integer. Therefore, we can first filter which columns satisfy this condition:

cols <- sapply(dados, function(x) is.numeric(x))

cols <- names(which(cols==TRUE))

Inspecting the object cols:

> cols
[1] "VALOR1" "VALOR2" "VALOR3" "VALOR4"

Now let’s use lapply() and Reduce(). lapply() why we need a type object list in Reduce(). Reduce() will match the objects of this list one by one successively through the function specified in its first argument .f. Therefore:

dados[!Reduce(`&`, lapply(cols, function(x) dados[[x]] == 0)), ]

Returns:

  CADASTRO VALOR1 VALOR2 VALOR3 VALOR4 VALOR5 VALOR6
2  ANTONIA     10      0     20     45      B    NOK
6    FABIO     20     30     45     90      F      D
7   MARCUS     45     32     12     22      G    2.2
9  ANTONIO      0     23     21    100      I      *

As explained before, the great advantage of this method is that it generalizes to any data.frame in which you want to apply the condition (in your case, return rows that do not have all their columns of type integer filled with 0).

Browser other questions tagged

You are not signed in. Login or sign up in order to post.