How to replace all NA values with the column mean

Hello.

How do I replace all NA values with the column mean? My code reads the file, and when I check for NA values there are none. But when I convert the columns to numeric, several NAs appear, and if I remove them the data shrinks drastically, from 4000 to about 600:

df<-read.csv("autores.csv", header=T, stringsAsFactors=F, sep=";")  

table(is.na(df))  # no NAs

df_numero<-lapply(df[-1], as.numeric)  

# rebuild the data frame, since lapply returns a list
df1<-data.frame(df_numero)  

table(is.na(df1))  # there are NAs
  • This is probably because the data needs cleaning: there are likely stray characters such as thousands separators (1,234.00) or similar. First look at the values that turn into NA, and only then apply as.numeric.
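
A minimal sketch of the check suggested in the comment above, assuming a small made-up vector `x` in place of a real column of `df`. It lists the text values that fail conversion, then applies one possible cleanup. Treating the comma as a thousands separator is an assumption about the data, not something the question confirms.

```r
# Hypothetical sample standing in for one character column of df
x <- c("10", "1,234.00", "7.5", "n/a")

# Convert and keep the values that were present as text but became NA
converted <- suppressWarnings(as.numeric(x))
bad <- x[!is.na(x) & is.na(converted)]
print(unique(bad))

# Assumed cleanup: drop thousands-separator commas, then convert again
cleaned <- suppressWarnings(as.numeric(gsub(",", "", x, fixed = TRUE)))
print(cleaned)
```

If the file instead uses the comma as the decimal mark, which is common in files separated by ";", passing `dec = ","` to `read.csv` (or using `read.csv2`) may avoid the problem at read time.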

1 answer


The following code replaces each NA with the mean of the column it is in:

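# df must contain only numeric columns (for example, df1 from the question)
for (i in seq_len(nrow(df))) {
  for (j in seq_len(ncol(df))) {
    # replace a missing cell with the mean of its column's non-missing values
    if (is.na(df[i, j])) {
      df[i, j] <- mean(df[, j], na.rm = TRUE)
    }
  }
}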
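
The same replacement can also be done one column at a time instead of cell by cell. This is a sketch, not part of the original answer: the small data frame and the helper name `fill_with_mean` are made up for illustration.

```r
# Hypothetical numeric data frame standing in for df1 from the question
df1 <- data.frame(a = c(1, NA, 3), b = c(NA, 10, 20))

# Illustrative helper: fill the NAs of one numeric vector with its mean
fill_with_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Apply to every column; assigning into df1[] keeps the data frame structure
df1[] <- lapply(df1, fill_with_mean)
print(df1)
```

This computes each column mean once rather than once per missing cell, which may matter on larger data.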
