This happens because the regular expression does not account for occurrences of e that come one right after the other.
That is why the following works:
/(\w+\se\s\w+)/g
foo aaa e bbb bar baz ccc e ddd qux
But this does not work as intended:
/(\w+\se\s\w+)/g
foo aaa e bbb e ccc qux
That is because the regular expression /(\w+\se\s\w+)/g cannot match two sequences that immediately follow one another. It requires a word both before and after each e. If the word before the next e has already been consumed by the previous match, that e has nothing "before" it, so the second match is impossible because the condition is not met.
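To make the behavior concrete, here is a minimal sketch that runs the original expression against both test strings. It assumes the matches are read with String.prototype.match; the variable names are only illustrative.

```javascript
// Original expression from the question.
const pattern = /(\w+\se\s\w+)/g;

// Two "word e word" sequences separated by other words.
const separated = "foo aaa e bbb bar baz ccc e ddd qux";
// Two sequences that share the middle word "bbb".
const consecutive = "foo aaa e bbb e ccc qux";

console.log(separated.match(pattern));   // [ 'aaa e bbb', 'ccc e ddd' ]
// "bbb" was consumed by the first match, so "e ccc" has no word before it.
console.log(consecutive.match(pattern)); // [ 'aaa e bbb' ]
```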
One solution is to indicate that the part after the first word (an e followed by another word) can be repeated within a single match. One option would be:
/\w+(?:\se\s\w+)+/g
foo aaa e bbb bar baz ccc e ddd e eee qux
See on Regex101.
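As a quick check, the sketch below applies this expression to the test string above and to the string that failed earlier. The variable names are illustrative, and it again assumes String.prototype.match is used to collect the results.

```javascript
// One word, followed by one or more repetitions of " e word".
const chained = /\w+(?:\se\s\w+)+/g;

const text = "foo aaa e bbb bar baz ccc e ddd e eee qux";
console.log(text.match(chained)); // [ 'aaa e bbb', 'ccc e ddd e eee' ]

// The consecutive case now comes back as a single match.
const consecutive = "foo aaa e bbb e ccc qux";
console.log(consecutive.match(chained)); // [ 'aaa e bbb e ccc' ]
```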
Although the regular expression above works when words consist of ASCII alphanumeric characters, accented letters (such as é, á, à, etc.) are not covered by \w.
So you can change the expression to:
/\p{L}+(?:\se\s\p{L}+)+/gu
foo aáà e bbb bar baz ccc e ddd e eéè qux
See on Regex101.
With the u flag you can use \p{L}, which matches any letter defined by the Unicode standard, including the accented characters mentioned above.
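The difference between \w and \p{L} can be seen by running both expressions on the same accented string. This is only a sketch: the variable names are illustrative, and the call to normalize("NFC") is an added assumption so that each accented letter is a single precomposed code point.

```javascript
const asciiWords = /\w+(?:\se\s\w+)+/g;
const unicodeWords = /\p{L}+(?:\se\s\p{L}+)+/gu;

// NFC keeps letters such as "á" as one code point instead of "a" + combining mark.
const text = "foo aáà e bbb bar baz ccc e ddd e eéè qux".normalize("NFC");

// \w stops at accented letters, so the first sequence is lost and the last is cut short.
console.log(text.match(asciiWords));   // [ 'ccc e ddd e e' ]
// \p{L} treats accented letters as part of the word.
console.log(text.match(unicodeWords)); // [ 'aáà e bbb', 'ccc e ddd e eéè' ]
```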
Although the u (unicode) flag is already well supported, some environments may not implement it. In such cases, for alternatives to \p{L} with the u flag, see this other answer.
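One possible way to cope with such environments is to detect support at run time and fall back to an explicit character class. The sketch below is not taken from the other answer: the helper names (supportsUnicodeProperties, buildPattern) are hypothetical, and the fallback class is an assumption that covers only Latin-1 accented letters, which is far narrower than \p{L}.

```javascript
// Assumed fallback: ASCII letters plus Latin-1 accented letters (skips × and ÷).
const LATIN1_LETTER = "[A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u00FF]";

// Returns true if this engine accepts \p{L} together with the u flag.
function supportsUnicodeProperties() {
  try {
    new RegExp("\\p{L}", "u");
    return true;
  } catch (err) {
    return false; // SyntaxError: unknown flag or unsupported property escape
  }
}

// Builds "word (e word)+" using either \p{L} or the Latin-1 fallback.
function buildPattern(useUnicode) {
  const letter = useUnicode ? "\\p{L}" : LATIN1_LETTER;
  const flags = useUnicode ? "gu" : "g";
  return new RegExp(letter + "+(?:\\se\\s" + letter + "+)+", flags);
}

const pattern = buildPattern(supportsUnicodeProperties());
console.log("foo aáà e bbb bar".normalize("NFC").match(pattern));
```

The fallback gives the same result as \p{L} for the Portuguese examples here, but it will miss letters outside Latin-1.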
Unrelated to the answer, but worth noting: the original regular expression from the question (/(\w+\se\s\w+)/g) could be replaced by /\w+\se\s\w+/g, since the capture group does nothing in this case.
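A small sketch of why the group is redundant, assuming the results are read with String.prototype.match and the g flag (in that mode only the full matches are returned, never the groups). The variable names are illustrative.

```javascript
const text = "foo aaa e bbb bar baz ccc e ddd qux";

// With the g flag, match() returns only the full matches.
const withGroup = text.match(/(\w+\se\s\w+)/g);
const withoutGroup = text.match(/\w+\se\s\w+/g);

console.log(withGroup);    // [ 'aaa e bbb', 'ccc e ddd' ]
console.log(withoutGroup); // [ 'aaa e bbb', 'ccc e ddd' ]
```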
If the phrase changes to
"olá eu e ela temos o numero é trezentos e vinte e quatro tudo bem?"
it will cause a problem. – Augusto Vasques
I'm upvoting the answers, but I think the problem is much bigger than regex: it is an NLP problem.
– Augusto Vasques
Good point, Augusto. I just searched for NLP and there are some libraries in Node that handle several of these cases, if I understand correctly, right? I'll dig a little deeper into the subject, thank you very much!!
– Jeferson