How to replace missing values in a data frame with the mean of each column in R?

I have this table:

tabela <- data.frame(v1 = c(1, NA, 3, 5, 4), v2 = c(NA, NA, 1, 2, 4), v3 = c(6, 5, 4, 7, NA))

I need the missing values (NA) in each column to be replaced by the mean of that column.

How can I do this using dplyr or a loop?

1 answer

Yes, you can do this with dplyr. Use the function mutate_all with a formula that checks which values are missing (is.na) and fills them with the column mean (mean with the argument na.rm = TRUE, so the NAs are ignored when the mean is computed):

library(tidyverse)

tabela <- data.frame(v1 = c(1,NA,3,5,4), 
                     v2 =c(NA,NA,1,2,4), 
                     v3 = c(6,5,4,7,NA))

tabela
#>   v1 v2 v3
#> 1  1 NA  6
#> 2 NA NA  5
#> 3  3  1  4
#> 4  5  2  7
#> 5  4  4 NA

tabela %>% 
    mutate_all(~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))
#>     v1       v2  v3
#> 1 1.00 2.333333 6.0
#> 2 3.25 2.333333 5.0
#> 3 3.00 1.000000 4.0
#> 4 5.00 2.000000 7.0
#> 5 4.00 4.000000 5.5

# checking the imputed values against the means of the non-missing values

tabela %>%
    summarise_all(mean, na.rm = TRUE)
#>     v1       v2  v3
#> 1 3.25 2.333333 5.5

Created on 2020-06-19 by the reprex package (v0.3.0)
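
Since the question also asks about a loop, here is a minimal sketch of two alternatives, assuming the same `tabela` data frame and dplyr 1.0.0 or later. The first uses `across()`, which superseded `mutate_all()` in dplyr 1.0.0; the second is a plain base R `for` loop over the columns. The names `imputed_across`, `imputed_loop`, `col`, `col_mean` and `missing` are chosen here for illustration and are not part of the original answer.

```r
library(dplyr)

tabela <- data.frame(v1 = c(1, NA, 3, 5, 4),
                     v2 = c(NA, NA, 1, 2, 4),
                     v3 = c(6, 5, 4, 7, NA))

# Option 1: across() applies the same formula to every column
imputed_across <- tabela %>%
  mutate(across(everything(),
                ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x)))

# Option 2: base R for loop, one column at a time (no packages needed)
imputed_loop <- tabela
for (col in names(imputed_loop)) {
  # mean of the observed (non-NA) values in this column
  col_mean <- mean(imputed_loop[[col]], na.rm = TRUE)
  # logical index of the missing positions
  missing <- is.na(imputed_loop[[col]])
  imputed_loop[[col]][missing] <- col_mean
}

# Both approaches should produce the same data frame
all.equal(imputed_across, imputed_loop)
```

Both versions assume every column is numeric. With mixed column types, `across(where(is.numeric), ...)` or an `is.numeric()` check inside the loop would likely be needed.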
