How to remove a word from a string without changing larger words that contain it

I would like to remove a word from a string in R. I was doing it as follows:

> s <- "ele esta bem mas tambem esta triste"
> stringr::str_replace_all(s, "tambem", "")
[1] "ele esta bem mas  esta triste"

So far, so good. The problem comes when I only want to remove the word "bem" from the text.

> stringr::str_replace_all(s, "bem", "")
[1] "ele esta  mas tam esta triste"

In this case the word "tambem" gets cut down to "tam", which is not what I want.

I thought of searching for the word surrounded by spaces:

> stringr::str_replace_all(s, " bem ", " ")
[1] "ele esta mas tambem esta triste"

But then, if I looked for the word "ele", it would not be removed, because it sits at the start of the string with no space before it. Is there a way to remove whole words without having to think through every case?

2 answers

  • I think the expression "\\b\\s?bem\\s?\\b" would be better, because the answer's version leaves three spaces in a row in the final string when the replacement is " ", and two when it is "". This expression also matches the spaces before and after the word (if any), so the whole match can be replaced with a single space. It still doesn't handle a word followed by punctuation at the end of a sentence (bem.), but in that case the best option is perhaps a second expression to clean up what is left over.
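A minimal sketch of the difference described in this comment, using base R's gsub() with perl = TRUE rather than stringr. The second-pass pattern for the "bem." case and the variable names are assumptions for illustration, not something given in the thread.

```r
s <- "ele esta bem mas tambem esta triste"

# Plain word boundaries: the word is removed, but both neighbouring
# spaces stay behind, and "tambem" is left untouched.
plain <- gsub("\\bbem\\b", "", s, perl = TRUE)
plain
# [1] "ele esta  mas tambem esta triste"

# Optional spaces inside the boundaries, replaced by a single space.
padded <- gsub("\\b\\s?bem\\s?\\b", " ", s, perl = TRUE)
padded
# [1] "ele esta mas tambem esta triste"

# Punctuation case: the first pass leaves a space before the period.
s2 <- "ele esta bem."
step1 <- gsub("\\b\\s?bem\\s?\\b", " ", s2, perl = TRUE)
step1
# [1] "ele esta ."

# Assumed second pass: drop whitespace that precedes punctuation,
# then trim the ends of the string.
cleaned <- trimws(gsub("\\s+([[:punct:]])", "\\1", step1, perl = TRUE))
cleaned
# [1] "ele esta."
```

The trimws() call also covers the case where the removed word was the first or last one in the string and a stray space is left at the edge.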

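Using the expression suggested by @Molx, "\\b\\s?bem\\s?\\b", but with the base R function gsub(). The replacement is a single space, so that the neighbouring words are not joined together:

 gsub("\\b\\s?bem\\s?\\b", " ", s)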
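To remove several words at once without handling each position separately, one possible approach is to strip the words using word boundaries only and then normalise the whitespace in a second step. The helper below, remove_words(), is a hypothetical name and not part of base R or stringr. It assumes the words are plain ASCII, as in the question's example, and contain no regex metacharacters.

```r
# remove_words() is a hypothetical helper written for this example.
remove_words <- function(text, words) {
  # Build one alternation such as \b(?:ele|bem)\b so that only
  # whole words match; "tambem" does not match "bem".
  pattern <- paste0("\\b(?:", paste(words, collapse = "|"), ")\\b")
  out <- gsub(pattern, "", text, perl = TRUE)
  # Collapse the runs of spaces left behind, then trim both ends.
  out <- gsub("\\s+", " ", out, perl = TRUE)
  trimws(out)
}

s <- "ele esta bem mas tambem esta triste"
result <- remove_words(s, c("ele", "bem"))
result
# [1] "esta mas tambem esta triste"
```

Separating the removal from the whitespace clean-up means the same pattern works whether the word is at the start, in the middle, or at the end of the string.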
