Replace NA in R

Viewed 5,016 times

4

I would like to replace NA (missing values) with a word. I have the following data:

structure(list(NOME = c("ABC", "ADD", 
"AFF", "DDD", "RTF", "DRGG"
), TIPO = c("INTERNACAO", "", "CONSULTA", "EXAME", "", "EXAME"
), VALOR = c(10L, 20L, 30L, 40L, 50L, 60L)), class = "data.frame", row.names = c(NA, 
-6L))

#NOME        TIPO  VALOR
#ABC   INTERNACAO     10
#ADD                  20
#AFF     CONSULTA     30
#DDD        EXAME     40
#RTF                  50
#DRGG       EXAME     60

How can I replace NA with the word TESTE?

  • Please improve the data formatting if possible, and indicate which function you are using and what your goal is with this data.

  • Try taking a look at this video, where they teach you how to replace NA values (in English): https://www.youtube.com/watch?v=LBVaCCKeo0

  • Hello, the video shows how to do it for all columns. I would like to replace the NA in a specific column.

3 answers

5


Assuming your data are in a data frame called dat, and that the column in which you want to replace the NA is called TIPO:

dat$TIPO[which(is.na(dat$TIPO))] <- "TESTE"

According to your data, though, I don't see NA in the column TIPO but empty strings. In that case, compare against "" instead of testing for NA:

dat$TIPO[which(dat$TIPO == "")] <- "TESTE"
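If it is unclear which kind of missing value a column holds, a quick count of each can help before replacing anything. The following is a minimal sketch using the data from the question. The helper names n_na, n_blank and is_missing are illustrative, and treating whitespace-only strings as empty (via trimws) is an assumption that may not suit every data set.

```r
# Sample data from the question; TIPO holds empty strings, not NA
dat <- data.frame(
  NOME  = c("ABC", "ADD", "AFF", "DDD", "RTF", "DRGG"),
  TIPO  = c("INTERNACAO", "", "CONSULTA", "EXAME", "", "EXAME"),
  VALOR = c(10L, 20L, 30L, 40L, 50L, 60L),
  stringsAsFactors = FALSE
)

# Count each kind of "missing" value in the column
n_na    <- sum(is.na(dat$TIPO))                       # true NA values
n_blank <- sum(trimws(dat$TIPO) == "", na.rm = TRUE)  # "" or whitespace-only

# Flag both kinds at once; is.na() is checked first so NA never reaches the index
is_missing <- is.na(dat$TIPO) | trimws(dat$TIPO) == ""
dat$TIPO[is_missing] <- "TESTE"
```

With a logical index like is_missing, the call to which() is not needed.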
  • Thank you very much!

  • @Brunoavila If this solved your problem, please consider accepting the answer.

  • I used the expression dat$TIPO[which(is.na(dat$TIPO))] <- "TESTE" and it worked. However, how would you replace the NA with the contents of the next column? Example: in the ADD row, the NA would be replaced by the number 20. Thanks.
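For the follow-up in the last comment, one possible approach is to index both columns with the same logical vector. This is a minimal sketch, assuming a variant of the question's data in which TIPO really contains NA (the original data contain empty strings). The name idx is illustrative.

```r
# Hypothetical variant of the question's data with real NA values in TIPO
dat <- data.frame(
  NOME  = c("ABC", "ADD", "AFF", "DDD", "RTF", "DRGG"),
  TIPO  = c("INTERNACAO", NA, "CONSULTA", "EXAME", NA, "EXAME"),
  VALOR = c(10L, 20L, 30L, 40L, 50L, 60L),
  stringsAsFactors = FALSE
)

# Rows where TIPO is missing
idx <- is.na(dat$TIPO)

# Copy the value from VALOR in the same rows; TIPO is a character column,
# so the integers are converted to text explicitly
dat$TIPO[idx] <- as.character(dat$VALOR[idx])
```

Note that TIPO stays a character column, so the copied values are stored as the strings "20" and "50", not as numbers.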

3

The dplyr package has a function called coalesce, which is designed for exactly this.

In your case, you could use:

library(dplyr)
dat$TIPO <- coalesce(dat$TIPO, "TESTE")
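Note that coalesce only fills NA values, so on the data as posted (where TIPO contains empty strings) it would leave the column unchanged. A minimal sketch of one way around this, assuming dplyr is installed, is to convert "" to NA with na_if first:

```r
library(dplyr)

# Sample data from the question; TIPO holds empty strings, not NA
dat <- data.frame(
  NOME  = c("ABC", "ADD", "AFF", "DDD", "RTF", "DRGG"),
  TIPO  = c("INTERNACAO", "", "CONSULTA", "EXAME", "", "EXAME"),
  VALOR = c(10L, 20L, 30L, 40L, 50L, 60L),
  stringsAsFactors = FALSE
)

# na_if() turns "" into NA, then coalesce() fills every NA with "TESTE"
dat$TIPO <- coalesce(na_if(dat$TIPO, ""), "TESTE")
```

This handles both real NA values and empty strings in a single expression.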

1

The case_when function from the dplyr package also does what you want. First, note that there is a difference between NA (missing values) and "" (empty strings), as mentioned in this answer.

First, reproduce your data:

df_1 <- structure(list(NOME = c("ABC", "ADD", "AFF", "DDD", "RTF", "DRGG"
), TIPO = c("INTERNACAO", "", "CONSULTA", "EXAME", "", "EXAME"
), VALOR = c(10L, 20L, 30L, 40L, 50L, 60L)), class = "data.frame", row.names = c(NA, 
-6L))

#  NOME       TIPO VALOR
#1  ABC INTERNACAO    10
#2  ADD               20
#3  AFF   CONSULTA    30
#4  DDD      EXAME    40
#5  RTF               50
#6 DRGG      EXAME    60

Now the analysis:

library(dplyr)

df_1 %>% 
  mutate(across(TIPO, ~ case_when(. == "" ~ "TESTE", TRUE ~ .)))

#  NOME       TIPO VALOR
#1  ABC INTERNACAO    10
#2  ADD      TESTE    20
#3  AFF   CONSULTA    30
#4  DDD      EXAME    40
#5  RTF      TESTE    50
#6 DRGG      EXAME    60

If there were NAs (missing values) instead of "" (empty strings), the code would look like this:

df_1 %>% 
  mutate(across(TIPO, ~ case_when(is.na(.) ~ "TESTE", TRUE ~ .)))
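Both conditions can also be combined in a single case_when. The sketch below assumes a hypothetical variant of the data in which TIPO mixes NA and "", and stores the result in a new object (df_2, an illustrative name), since the pipeline alone only prints the result and leaves df_1 unchanged.

```r
library(dplyr)

# Hypothetical variant of the data with both NA and "" in TIPO
df_1 <- data.frame(
  NOME  = c("ABC", "ADD", "AFF", "DDD", "RTF", "DRGG"),
  TIPO  = c("INTERNACAO", "", "CONSULTA", NA, "", "EXAME"),
  VALOR = c(10L, 20L, 30L, 40L, 50L, 60L),
  stringsAsFactors = FALSE
)

# Replace NA and "" in one pass and keep the result in df_2
df_2 <- df_1 %>%
  mutate(TIPO = case_when(
    is.na(TIPO) | TIPO == "" ~ "TESTE",  # missing or empty
    TRUE ~ TIPO                          # everything else unchanged
  ))
```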
