How to calculate the percentage of NA in a data frame in R?

0

Hello,

I’m working with a large data frame (1,000 variables and 60,000 rows) and I need to calculate the percentage of NA and blank values for each variable separately.

What’s the best way to do it in R?

2 answers

3

To count the NAs in each column you can use the function colSums():

# total number of rows
n <- nrow(df)

# percentage of NA per column
round(colSums(is.na(df)) * 100 / n, 2)

Alternatively, you can use the function apply():

# function to count NAs
sum_NA <- function(dados) {
  sum(is.na(dados))
}

# total number of rows
n <- nrow(df)

# apply the function to each column
round(apply(df, 2, sum_NA) * 100 / n, 2)
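The two approaches above can be compared on a small example. This is only a minimal sketch: the data frame `toy` and the names `pct_colsums` and `pct_apply` are hypothetical and not part of the question. Keep in mind that apply() first converts the data frame to a matrix, so on a very wide data frame colSums() is probably the lighter option.

```r
# Hypothetical toy data frame, used only for illustration
toy <- data.frame(
  x = c(1, NA, 3, NA),
  y = c("a", "b", NA, "d"),
  z = c(TRUE, FALSE, TRUE, TRUE)
)

n <- nrow(toy)

# Percentage of NA per column via colSums()
pct_colsums <- round(colSums(is.na(toy)) * 100 / n, 2)

# Same calculation via apply() over the columns (MARGIN = 2)
pct_apply <- round(apply(toy, 2, function(col) sum(is.na(col))) * 100 / n, 2)

print(pct_colsums)

# Check that both approaches agree
print(isTRUE(all.equal(pct_colsums, pct_apply)))
```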
  • 3

    Proportion of NA per column: colMeans(is.na(df)). (For a percentage, multiply by 100.)

  • Exactly, I forgot that detail.

  • Thank you very much, Fernandes and Rui Barradas. I’m still taking my first steps in R and your help was very valuable!

0

Well, one way to do it is to write a loop that goes through your data frame column by column.

I created a data frame to illustrate:

df <- data.frame(A=c(NA,2,'',1),B=c('',4,4,2),C=c(5,'','',''),D=c(7,7,5,4),E=c('','',NA,NA),F=c(9,9,0,6))

Note that some of the columns contain blank values and NAs.

for (i in seq_len(ncol(df))) {
    col <- df[[i]]
    # percentage of values in this column that are NA or an empty string
    print(sum(is.na(col) | col == "") / length(col) * 100)
}

This loop visits each column and calculates the percentage you need. For my data frame it prints the following results:

[1] 50
[1] 25
[1] 75
[1] 0
[1] 100
[1] 0

Want something simpler and maybe faster? Try:

print(colMeans(is.na(df) | df == "")*100)

That gives the following output:

  A   B   C   D   E   F 
 50  25  75   0 100   0 

Note that is.na() is an R function that flags all the NAs; I combined it with an OR (|) to also catch the empty strings (== ""). I think this last option is faster because it only uses vectorized functions that are implemented natively in R.
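The comparison with "" only catches strings that are completely empty. If the data may also contain values made up only of spaces, one possible extension is to trim them first with trimws(). This is just a sketch under that assumption: the helper `is_blank` and the data frame `df2` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: TRUE where a value is NA, empty, or whitespace-only
is_blank <- function(x) {
  is.na(x) | trimws(as.character(x)) == ""
}

# Hypothetical example data with a whitespace-only value in column A
df2 <- data.frame(
  A = c(NA, "2", "  ", "1"),
  B = c("", "4", "4", "2"),
  D = c(7, 7, 5, 4)
)

# Percentage of NA / blank values per column
pct_blank <- sapply(df2, function(col) mean(is_blank(col)) * 100)
print(pct_blank)
```

Here sapply() works column by column, so each column keeps its own type instead of the whole data frame being converted to a character matrix.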

  • Daniel and Eder, thank you so much! Valuable help for those who are starting in R like me!
