Replacing NA values in a column with the value from the row above in the same column of a data frame

2

I have the data frame below, and I want to write a script that replaces each NA with the value from the row above in the same column.

DADOS <- data.frame(
  a = c(1, 2, NA, 3, 4), 
  b = c(5, 6, 7, NA, NA)
)

Below is the result I would like:

DADOSRESULTADO <- data.frame(
  a = c(1, 2, 2, 3, 4), 
  b = c(5, 6, 7, 7, 7)
)

3 answers

5

The function na.locf from the zoo package does exactly what is requested:

library(zoo)
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric

na.locf(DADOS)
#>   a b
#> 1 1 5
#> 2 2 6
#> 3 2 7
#> 4 3 7
#> 5 4 7

Created on 2021-06-02 by the reprex package (v2.0.0)
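
If installing zoo is not an option, the same "last observation carried forward" idea can be sketched in base R. This is only an illustration, not how na.locf is implemented: the helper name locf and the object RESULTADO are hypothetical names chosen for this example.

```r
DADOS <- data.frame(
  a = c(1, 2, NA, 3, 4),
  b = c(5, 6, 7, NA, NA)
)

# Hypothetical helper: carry the last non-NA value forward in one vector
locf <- function(x) {
  idx <- cumsum(!is.na(x))  # position of the last non-NA value seen so far
  idx[idx == 0] <- NA       # leading NAs have nothing to carry forward
  x[!is.na(x)][idx]
}

# Apply the helper to every column, keeping the data frame structure
RESULTADO <- DADOS
RESULTADO[] <- lapply(DADOS, locf)
RESULTADO
```

An NA in the first row stays NA under this sketch, because there is no earlier value to copy.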

4

An alternative would be to use the function fill from the tidyr package:

library(tidyr)

DADOS <- data.frame(
  a = c(1, 2, NA, 3, 4), 
  b = c(5, 6, 7, NA, NA)
)

# Fill NAs in columns a and b downward (the default direction)
DADOS %>% fill(a, b)

Output:

  a b
1 1 5
2 2 6
3 2 7
4 3 7
5 4 7
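
As a small extension of the call above, fill also accepts tidyselect helpers and a .direction argument. The sketch below assumes a reasonably recent tidyr (1.0.0 or later); the object names para_baixo and para_cima are made up for this example.

```r
library(tidyr)

DADOS <- data.frame(
  a = c(1, 2, NA, 3, 4),
  b = c(5, 6, 7, NA, NA)
)

# Fill every column downward without naming each one
para_baixo <- fill(DADOS, everything())

# Fill upward instead: each NA takes the next non-NA value below it
para_cima <- fill(DADOS, everything(), .direction = "up")

para_baixo
para_cima
```

With .direction = "up", the trailing NAs in column b remain NA, since no value follows them.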

2

One more option is the function data.table::nafill. The variant setnafill can be used to modify the data by reference:

library(data.table)

setnafill(DADOS, "locf")

DADOS
#>   a b
#> 1 1 5
#> 2 2 6
#> 3 2 7
#> 4 3 7
#> 5 4 7

You can also use "const" to fill with a fixed value (given in the fill argument) and "nocb" (next observation carried backward) to use the following value. The cols option lets you select columns by name or number (the default is to apply to all columns).
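
A short sketch of those options, using the same example data. The object names proximo and constante are made up for this illustration, and the fill value 0 is arbitrary.

```r
library(data.table)

DADOS <- data.frame(
  a = c(1, 2, NA, 3, 4),
  b = c(5, 6, 7, NA, NA)
)

# nafill() returns a new vector and leaves DADOS untouched
proximo   <- nafill(DADOS$a, type = "nocb")             # NA takes the next value
constante <- nafill(DADOS$b, type = "const", fill = 0)  # NA becomes 0

# setnafill() modifies DADOS by reference; here only column b is filled
setnafill(DADOS, type = "locf", cols = "b")

proximo
constante
DADOS
```

Since cols restricts the operation to b, the NA in column a is still there after the setnafill call.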
