I have this table:
tabela<-data.frame(v1 = c(1,NA,3,5,4), v2 =c(NA,NA,1,2,4), v3 = c(6,5,4,7,NA))
I need the missing values (NA) in each column to be replaced by the mean of that column.
How can I do this with dplyr or with a loop?
Yes, this can be done with dplyr. Just use the function mutate_all, indicating where values should be changed (is.na) and how they should be filled (mean with the argument na.rm = TRUE):
library(tidyverse)
tabela <- data.frame(v1 = c(1, NA, 3, 5, 4),
                     v2 = c(NA, NA, 1, 2, 4),
                     v3 = c(6, 5, 4, 7, NA))
tabela
#>   v1 v2 v3
#> 1  1 NA  6
#> 2 NA NA  5
#> 3  3  1  4
#> 4  5  2  7
#> 5  4  4 NA
tabela %>%
  mutate_all(~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))
#>     v1       v2  v3
#> 1 1.00 2.333333 6.0
#> 2 3.25 2.333333 5.0
#> 3 3.00 1.000000 4.0
#> 4 5.00 2.000000 7.0
#> 5 4.00 4.000000 5.5
# checking the column means computed from the non-imputed values
tabela %>%
  summarise_all(mean, na.rm = TRUE)
#>     v1       v2  v3
#> 1 3.25 2.333333 5.5
Created on 2020-06-19 by the reprex package (v0.3.0)
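
Two alternative sketches, in case they are useful. The first assumes dplyr 1.0.0 or later, where across() supersedes mutate_all. The second covers the loop part of the question using base R only. The names filled_across and filled_loop are made up for this illustration and are not part of the original answer.

```r
library(dplyr)

tabela <- data.frame(v1 = c(1, NA, 3, 5, 4),
                     v2 = c(NA, NA, 1, 2, 4),
                     v3 = c(6, 5, 4, 7, NA))

# Assumes dplyr >= 1.0.0: across() applies the same replacement to every column
filled_across <- tabela %>%
  mutate(across(everything(), ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x)))

# Base R: loop over the columns and overwrite only the NA positions
filled_loop <- tabela
for (col in names(filled_loop)) {
  missing_rows <- is.na(filled_loop[[col]])
  filled_loop[[col]][missing_rows] <- mean(filled_loop[[col]], na.rm = TRUE)
}

# Compare the two results
all.equal(filled_across, filled_loop)
```

Both versions compute each column's mean from the observed values only, so the imputed entries match the means shown by summarise_all above.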