Extract only uppercase words with R

library(stringr)

I am trying to extract only the uppercase words from a string.

teste <- "Isto é um Teste para ver se Eu consigo capturar APENAS as Palavras TOTALMENTE Maiusculas"
teste

[1] "Isto é um Teste para ver se Eu consigo capturar APENAS as Palavras TOTALMENTE Maiusculas"

I use str_extract_all() and pass a regex that is meant to capture only word characters ("\w") that are uppercase ("[:upper:]").

The function does find the correct words, but it splits them into two-letter pieces.

str_extract_all(teste, "\\w[:upper:]")

[[1]]
[1] "AP" "EN" "AS" "TO" "TA" "LM" "EN" "TE"

Why does this happen, and what is the correct way to do it?

1 answer

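Your pattern matches exactly two characters at a time (one word character followed by one uppercase letter), which is why each word is split into pairs. Instead, match one or more uppercase letters between word boundaries: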

str_extract_all(teste, '\\b[A-Z]+\\b')

or

str_extract_all(teste, "\\b[:upper:]+\\b")
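
As a quick check, here is a minimal sketch that runs the question's original pattern and the word-boundary pattern side by side on the same string. The variable names pares and palavras are only illustrative and are not part of the question.

```r
library(stringr)

teste <- "Isto é um Teste para ver se Eu consigo capturar APENAS as Palavras TOTALMENTE Maiusculas"

# Original pattern: each match is exactly two characters
# (one word character followed by one uppercase letter).
pares <- str_extract_all(teste, "\\w[:upper:]")[[1]]
print(pares)

# One or more uppercase letters enclosed by word boundaries,
# so only words made up entirely of uppercase letters match.
palavras <- str_extract_all(teste, "\\b[:upper:]+\\b")[[1]]
print(palavras)
# [1] "APENAS"     "TOTALMENTE"
```

Note that + also accepts a one-letter uppercase word; if those should be excluded, a quantifier such as {2,} could be used instead. Also, '[A-Z]' covers only unaccented ASCII letters, while '[:upper:]' in stringr should also cover accented uppercase letters. In base R regex functions the class is normally written as [[:upper:]].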
