Use of the GREP Function

4

Hello,

I have a .csv file in which the zip code (CEP) appears in the middle of the address, in square brackets. Some addresses have no zip code, but the pair of brackets [] is always present, possibly empty. For example:

RUA ESTEVAM DE ARAÚJO DE ALMEIDA 521 L 17 Q. 15 [23028730] GUARATIBA

I want this information in a new variable. To do this, I wrote R code that extracts the CEP with the grep function, but it produces an error:

hans$cep <- grep("\\[*?\\d{8}\\]", hans$endereco.do.domicilio, value = T)

Error in `$<-.data.frame`(`*tmp*`, "cep", value = c("RUA ESTEVAM DE ARAÚJO DE ALMEIDA 521 L 17 Q. 15 [23028730] GUARATIBA",  : 
  replacement has 59940 rows, data has 61674
  • Try using the str_extract function from the stringr package.

  • The error occurs because grep returns only the values that match, whereas str_extract returns NA when nothing is found. The assignment into the data.frame therefore fails: grep returns fewer values (59940) than the data.frame has rows (61674).
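
The length mismatch described in the comment above can be reproduced on a small scale. This is only a sketch in base R: the two addresses are made-up samples modeled on the one in the question, and the pattern is a simplified version of the one the asker used. It shows that grep(value = TRUE) drops the non-matching rows, while regexpr() with regmatches() can be used to build a vector with one entry per row.

```r
# Made-up sample data (assumption): one address with a CEP, one with empty brackets
hans <- data.frame(
  endereco.do.domicilio = c(
    "RUA EXEMPLO 521 L 17 Q. 15 [23028730] GUARATIBA",
    "RUA EXEMPLO 10 [] CENTRO"
  ),
  stringsAsFactors = FALSE
)

pattern <- "\\[\\d{8}\\]"

# grep(value = TRUE) keeps only the matching elements,
# so the result is shorter than the column it came from
matched <- grep(pattern, hans$endereco.do.domicilio, value = TRUE)
length(matched)  # 1 element, while nrow(hans) is 2

# regexpr() returns one position per element (-1 where nothing matches),
# so the matches can be placed back on the right rows and the rest left as NA
pos <- regexpr(pattern, hans$endereco.do.domicilio)
cep <- rep(NA_character_, nrow(hans))
cep[pos != -1] <- regmatches(hans$endereco.do.domicilio, pos)
hans$cep <- cep
```

Assigning `matched` to `hans$cep` would fail with the same kind of "replacement has ... rows" error as in the question, while `cep` always has the same length as the data.frame.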

2 answers

1

Good morning, try using the following code:

library(tidyverse)

# str_extract() returns NA for rows without a match, so CEP has one value per row
hans <- hans %>%
  mutate(CEP = str_extract(endereco.do.domicilio, "\\[*?\\d{8}\\]"))
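
Note that this pattern also matches the brackets, so the extracted value looks like "[23028730]". If only the eight digits are wanted, one possible variation (a sketch in base R with made-up sample addresses, not part of the answer above) is to test each row with grepl() and pull out the digits with a capture group in sub():

```r
# Made-up sample addresses (assumption), modeled on the one in the question
enderecos <- c(
  "RUA EXEMPLO 521 L 17 Q. 15 [23028730] GUARATIBA",
  "RUA EXEMPLO 10 [] CENTRO"
)

# grepl() returns one TRUE/FALSE per element, so the length is preserved
has_cep <- grepl("\\[\\d{8}\\]", enderecos)

# sub() replaces the whole string with the captured digits;
# rows without a CEP are set to NA instead
cep <- ifelse(has_cep,
              sub(".*\\[(\\d{8})\\].*", "\\1", enderecos),
              NA_character_)
cep
```

Because grepl() and ifelse() both return one value per input element, the result can be assigned directly to a new column of the data.frame.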

0

Use this shell command:

CEP=$(echo "$endereco_completo" | tr "\[" "\n" | grep ']' | cut -c1-8)

Pay attention to the quotation marks: the command can fail if the grep pattern is written with " instead of '.
