Given a variable such as var palavra = 'nascer', which regular expression matches every whole-word occurrence of it in a text, including words that start or end with an accented letter?
Considerations:
- A word is not a substring: "nasceram".includes("nascer") returns true, which does not work for me, because I am looking for whole words, not substrings (does that make sense? If anything is unclear, tell me in the comments).
- I do not need to handle whitespace or punctuation before and after the word, because the variable palavra holds only the word itself: nothing before, nothing after.
- I used word boundaries (\b), which work quite well, but they do not handle accented letters.
Here is what I did:
var palavra = 'nascer'
// the regex that matches every whole word 'nascer'
const regex = new RegExp(`\\b${palavra}\\b`, 'g');
With the word boundaries and the global flag, it works: it matches every occurrence of the whole word, without derivatives such as 'nasceram'.
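To make the difference between a substring test and the boundary-based regex concrete, here is a small sketch. The sample sentence is made up for illustration; only palavra and the regex come from the question.

```javascript
const palavra = 'nascer';
// Hypothetical sample text, not from the question.
const texto = 'Vai nascer. Eles nasceram ontem.';

// Substring test: true, because 'nasceram' contains 'nascer'.
const comoSubstring = 'nasceram'.includes(palavra);

// Whole-word test: \b requires a word/non-word transition on both sides,
// so the 'nascer' inside 'nasceram' is skipped.
const regex = new RegExp(`\\b${palavra}\\b`, 'g');
const achadas = texto.match(regex);

console.log(comoSubstring); // true
console.log(achadas);       // [ 'nascer' ]
```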
However, \b does not handle accents, so the regex finds nothing when the variable starts or ends with an accented letter:
var palavra = 'água'
// or
var palavra = 'café'
In the cases above, the regex does not work.
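The reason is that \b is defined in terms of \w, and in JavaScript \w covers only ASCII letters, digits and the underscore, so 'á' and 'é' count as non-word characters. A short sketch of the failure, using a made-up sample sentence:

```javascript
// Hypothetical sample text, not from the question.
const texto = 'A água do café é boa';

// No boundary exists between the space and 'á' (both are non-word characters).
const achouAgua = texto.match(new RegExp('\\bágua\\b', 'g'));

// No boundary exists between 'é' and the following space, for the same reason.
const achouCafe = texto.match(new RegExp('\\bcafé\\b', 'g'));

// The opposite problem: 'caf' is wrongly seen as a whole word inside 'café',
// because the transition 'f' -> 'é' counts as a boundary.
const falsoPositivo = /\bcaf\b/.test('café');

console.log(achouAgua, achouCafe, falsoPositivo); // null null true
```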
How can I improve this regex to select whole words (taken from a variable), including words with an accent at the beginning or end?
I tried something with ^ and $, but it did not work:
/^[A-Za-záàâãéèêíïóôõöúçñÁÀÂÃÉÈÍÏÓÔÕÖÚÇÑ ]+$/
// or this
/^[a-záàâãéèêíïóôõöúçñ ]+$/i
but I don't know how to use a variable as the matching criterion.
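One possible way to use the variable as the criterion is to replace \b with Unicode-aware lookarounds. This is a sketch, not a definitive solution: the helper names escaparRegex and regexPalavraInteira are hypothetical, and it assumes an engine that supports lookbehind and \p{...} property escapes (the latter require the u flag).

```javascript
// Escape regex metacharacters so the variable is matched literally.
function escaparRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds a whole-word regex from a variable.
// A "word character" here is any Unicode letter, combining mark, number or '_'.
function regexPalavraInteira(palavra, flags = 'gu') {
  const p = escaparRegex(palavra.normalize('NFC'));
  const letra = '[\\p{L}\\p{M}\\p{N}_]';
  // Not preceded and not followed by a word character.
  return new RegExp(`(?<!${letra})${p}(?!${letra})`, flags);
}

// Hypothetical sample text; normalized so accents are single code points.
const texto = 'A água deságua nas águas; nascer, nasceram, café.'.normalize('NFC');

const agua = texto.match(regexPalavraInteira('água'));
const nascer = texto.match(regexPalavraInteira('nascer'));
const cafe = texto.match(regexPalavraInteira('café'));

console.log(agua, nascer, cafe); // [ 'água' ] [ 'nascer' ] [ 'café' ]
```

Normalizing both the word and the text to NFC is a precaution: the same accented letter can be stored either precomposed or as a base letter plus a combining mark, and the two forms would otherwise not match each other.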
Hmm, perfect. Any ideas on dealing with case sensitivity? For example, with testar('água', 'Água nas águas, deságua'); should it also replace 'Água' with X? I tried the 'ugi' flags but it did not work. – Luke Negreiros
@Lukenegreiros The i flag should work: https://ideone.com/ZOU3TZ - I don't know whether this is one of those things that vary from browser to browser; Unicode support is quite reasonable nowadays, but I would not rule it out... – hkotsubo
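As a minimal sketch of the i flag combined with a Unicode-aware whole-word pattern: the function marcarComX below is hypothetical (it is not the testar function from the linked code, whose body is not shown here), and it assumes support for lookbehind and \p{...} with the u flag.

```javascript
// Hypothetical helper: replaces whole-word, case-insensitive matches with 'X'.
function marcarComX(palavra, texto) {
  // Escape metacharacters so the word is matched literally.
  const p = palavra.normalize('NFC').replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // g = all matches, i = ignore case, u = Unicode mode (needed for \p{...}).
  const regex = new RegExp(`(?<![\\p{L}\\p{M}])${p}(?![\\p{L}\\p{M}])`, 'giu');
  return texto.normalize('NFC').replace(regex, 'X');
}

const resultado = marcarComX('água', 'Água nas águas, deságua');
console.log(resultado); // X nas águas, deságua
```

Only the capitalized 'Água' is replaced: 'águas' is followed by a letter and 'deságua' has a letter before the match, so both are rejected by the lookarounds.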
My implementation is wrong. Look at this: https://jsfiddle.net/apw1ruy2/ It is a program that marks repeated words of more than three letters within a sentence. In the function marcar_palavra_repetida(), near the bottom, I build fraseHTML using the word passed in as a parameter to construct the string, so it never marks two equal words that differ only in case (note the second line: 'Advinha' and 'advinha' appear in the same sentence, and I would like both to be marked). – Luke Negreiros
@Lukenegreiros The problem with "Advinha" is that indexOf is not case-insensitive: https://jsfiddle.net/4yk6xf83/1/ – hkotsubo
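To illustrate the point about indexOf, here is a small sketch that lowercases both strings before searching. The function posicoes and the sample sentence are hypothetical and are not taken from the linked fiddles. Note that this finds substrings, not whole words, and that it assumes lowercasing does not change the string length (true for Portuguese text, but not for every Unicode character).

```javascript
// Hypothetical helper: case-insensitive positions of `palavra` in `frase`.
function posicoes(frase, palavra) {
  const f = frase.toLowerCase();
  const p = palavra.toLowerCase();
  const achadas = [];
  if (p.length === 0) return achadas; // avoid looping forever on an empty word
  let i = f.indexOf(p);
  while (i !== -1) {
    achadas.push(i);
    i = f.indexOf(p, i + p.length); // continue after the previous match
  }
  return achadas;
}

const frase = 'Advinha quem advinha';
const sensivel = frase.indexOf('advinha');      // skips the capitalized word
const insensivel = posicoes(frase, 'advinha');  // finds both occurrences

console.log(sensivel, insensivel); // 13 [ 0, 13 ]
```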