How to calculate the average of a column in RStudio while ignoring the 0 values in the column?

Example: column 1 contains the values 1 2 3 4 0 0 0. The ordinary average would be 1.428571, but ignoring the zeros it would be 2.5. I would like to know how to compute this, ignoring the 0 values in the column.

2 answers

Assuming the dataset is called dados and has two columns called c1 and c2, with the following values:

dados <- data.frame(c1=c(1:4, rep(0, 3)), c2=7:1)
dados
  c1 c2
1  1  7
2  2  6
3  3  5
4  4  4
5  0  3
6  0  2
7  0  1

do the following:

mean(dados[dados$c1!=0, 1])

The code above selects the rows of dados whose value in the first column is different from 0, and keeps only the first column of the data frame. With the correct rows and column selected, mean simply calculates the average of what remains.

An alternative is to refer to the first column by name instead of by the number 1, as the command below does:

dados[dados$c1!=0, "c1"]

The result will be the same, regardless of the method used.
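As a further sketch, not part of the original answer, the same average can be obtained by subsetting the column vector directly, or by replacing the zeros with NA and letting mean skip them with na.rm = TRUE. The names media_1, media_2 and c1_sem_zero are made up for illustration. If the column may already contain NA values, the first form would return NA unless na.rm = TRUE is added there as well.

```r
dados <- data.frame(c1 = c(1:4, rep(0, 3)), c2 = 7:1)

# Subset the column vector directly, keeping only the non-zero values
media_1 <- mean(dados$c1[dados$c1 != 0])

# Or replace the zeros with NA and let mean() ignore them
c1_sem_zero <- replace(dados$c1, dados$c1 == 0, NA)
media_2 <- mean(c1_sem_zero, na.rm = TRUE)

media_1  # 2.5
media_2  # 2.5
```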

An alternative to Marcus Nunes' answer is the example below. I took the liberty of using the data set he proposed. The example uses the packages below, so make sure they are installed on your computer.

library(dplyr)
library(magrittr)
dados <- data.frame(c1=c(1:4, rep(0, 3)), c2=7:1)
dados

Assuming you want the average of column c1, you can use dplyr's filter function to keep only the rows where c1 is non-zero. Store the result of this operation in an object, then use the summarise function from the same package to generate a column called MEDIA with the average of the filtered data. The solution is presented in two ways, with and without the pipe from the magrittr package.

Without the pipe from the magrittr package:

dt.filtro <- filter(dados, c1 != 0)
summarise(dt.filtro, MEDIA = mean(c1))

With the pipe from magrittr, you can avoid creating intermediate objects like the dt.filtro of the example above.

dados %>% filter(c1 != 0) %>% summarise(MEDIA = mean(c1))

If you want the averages of both columns, still computed after dropping the rows where c1 is 0, just use summarise_each, a variant of summarise.

dados %>% filter(c1 != 0) %>% summarise_each(funs(mean))
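Note that summarise_each and funs are deprecated in current dplyr, and across (available from dplyr 1.0.0) is the suggested replacement. A minimal sketch of the equivalent call, assuming a recent dplyr is installed (the name medias is made up for illustration):

```r
library(dplyr)

dados <- data.frame(c1 = c(1:4, rep(0, 3)), c2 = 7:1)

# Keep the rows where c1 is non-zero, then average every column
medias <- dados %>%
  filter(c1 != 0) %>%
  summarise(across(everything(), mean))

medias
```

For this data set the four remaining rows have c1 = 1, 2, 3, 4 and c2 = 7, 6, 5, 4, so the averages should come out as 2.5 and 5.5.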
