How to replace missing values in a data frame with the mean of each column in R?

Question

Asked 5 years, 1 month ago

I have this table:

tabela <- data.frame(v1 = c(1, NA, 3, 5, 4), v2 = c(NA, NA, 1, 2, 4), v3 = c(6, 5, 4, 7, NA))

I need the missing values (NA) in each column to be replaced with the mean of that column.

How can I do this using dplyr or a loop?

1 answer

by Marcus Nunes • **17,915** points · Answer 1 · 2020-06-19T17:19:19+00:00

Yes, this can be done with dplyr. Use the function mutate_all with a formula that indicates where values should be changed (is.na) and how they should be filled (mean with the argument na.rm = TRUE, so the mean is computed from the non-missing values only):

library(tidyverse)

tabela <- data.frame(v1 = c(1, NA, 3, 5, 4),
                     v2 = c(NA, NA, 1, 2, 4),
                     v3 = c(6, 5, 4, 7, NA))

tabela
#>   v1 v2 v3
#> 1  1 NA  6
#> 2 NA NA  5
#> 3  3  1  4
#> 4  5  2  7
#> 5  4  4 NA

tabela %>% 
    mutate_all(~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))
#>     v1       v2  v3
#> 1 1.00 2.333333 6.0
#> 2 3.25 2.333333 5.0
#> 3 3.00 1.000000 4.0
#> 4 5.00 2.000000 7.0
#> 5 4.00 4.000000 5.5

# checking the column means computed from the non-missing values

tabela %>%
    summarise_all(mean, na.rm = TRUE)
#>     v1       v2  v3
#> 1 3.25 2.333333 5.5

Created on 2020-06-19 by the reprex package (v0.3.0)
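
The question also asks about a loop, and since dplyr 1.0.0 the scoped verbs such as mutate_all have been superseded by across(). The sketch below is not part of the original answer; it shows both alternatives on the same data, and the names imputed_dplyr, imputed_loop, col and missing_rows are made up for illustration.

```r
library(dplyr)

tabela <- data.frame(v1 = c(1, NA, 3, 5, 4),
                     v2 = c(NA, NA, 1, 2, 4),
                     v3 = c(6, 5, 4, 7, NA))

# dplyr >= 1.0.0: across() applies the same replacement to every column
imputed_dplyr <- tabela %>%
    mutate(across(everything(),
                  ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x)))

# Base R loop: visit each column and overwrite only the NA positions
imputed_loop <- tabela
for (col in names(imputed_loop)) {
    missing_rows <- is.na(imputed_loop[[col]])
    # The right-hand side is evaluated before the assignment,
    # so the mean uses only the original non-missing values
    imputed_loop[[col]][missing_rows] <- mean(imputed_loop[[col]], na.rm = TRUE)
}

# Compare the two results
all.equal(imputed_dplyr, imputed_loop)
```

Both versions assume every column is numeric and has at least one non-missing value. For a column that is entirely NA, mean(..., na.rm = TRUE) returns NaN, so those cells would be filled with NaN rather than a number.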