4
Suppose I have extracted this URL:
/ac/rio-branco/xpto-xyz-1-0-16-5-abcd-a1G57000003DE4QEAW
I only want the piece that starts with a1G. Does anyone know how to get just this part?
2
You can do this using the stringr package and regular expressions.
In your case, I would do the following:
s <- "/ac/rio-branco/xpto-xyz-1-0-16-5-abcd-a1G57000003DE4QEAW"
stringr::str_extract(s, "a1G\\S+\\s*")
[1] "a1G57000003DE4QEAW"
This code also works when s is a vector, so it can be applied to a data.frame column as follows:
df$extrair <- stringr::str_extract(df$url, "a1G\\S+\\s*")
Note that if you don’t have the stringr package installed, you will need to install it with the command install.packages("stringr").
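To make the data.frame case concrete, here is a minimal sketch. The data frame df and its second URL are made up for illustration; only the first URL comes from the question. It also shows what happens for a row that has no a1G piece: str_extract returns NA for it, so the new column stays aligned with the rows.

```r
library(stringr)

# Hypothetical data: the second URL is invented and contains no "a1G" piece.
df <- data.frame(
  url = c(
    "/ac/rio-branco/xpto-xyz-1-0-16-5-abcd-a1G57000003DE4QEAW",
    "/ac/rio-branco/no-id-here"
  ),
  stringsAsFactors = FALSE
)

# Same pattern as above, applied to the whole column at once.
df$extrair <- str_extract(df$url, "a1G\\S+\\s*")

# First row holds "a1G57000003DE4QEAW", second row holds NA.
print(df$extrair)
```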
1
Extracting part of a string using only base R is rather tedious, but possible. I chose a simpler regular expression than Daniel’s, since you weren’t very specific. It would look like this:
> s <- "/ac/rio-branco/xpto-xyz-1-0-16-5-abcd-a1G57000003DE4QEAW"
> regmatches(s, gregexpr("a1G.+", s))
[[1]]
[1] "a1G57000003DE4QEAW"
Note that the result is a list with one element for each string in the vector s, and each element holds all matches of the regular expression in that string. If you want a single vector as output, you can use unlist:
> s <- c("/ac/rio-branco/xpto-xyz-1-0-16-5-abcd-a1G57000003DE4QEAW", "abcsda1G000")
> regmatches(s, gregexpr("a1G.+", s))
[[1]]
[1] "a1G57000003DE4QEAW"
[[2]]
[1] "a1G000"
> unlist(regmatches(s, gregexpr("a1G.+", s)))
[1] "a1G57000003DE4QEAW" "a1G000"
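One caveat with unlist: a string with no match produces an empty element in the list, and unlist drops it, so the result can end up shorter than s. If the output has to line up with the input (for example, to fill a data.frame column), a possible variation is to use regexpr, which finds only the first match per string, and fill the gaps with NA. This is only a sketch; the third input string and the names hit and out are invented for illustration.

```r
s <- c("/ac/rio-branco/xpto-xyz-1-0-16-5-abcd-a1G57000003DE4QEAW",
       "abcsda1G000",
       "no-match-here")  # hypothetical string without "a1G"

# regexpr returns the position of the first match, or -1 when there is none.
m <- regexpr("a1G.+", s)
hit <- as.integer(m) != -1L

# Start with NA everywhere, then fill in the strings that did match.
# regmatches(s, m) returns matches only for the elements that matched.
out <- rep(NA_character_, length(s))
out[hit] <- regmatches(s, m)

# out has the same length as s: two extracted pieces followed by NA.
print(out)
```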
Thank you so much for your help!!!
– Felipe Amaral Rodrigues