Regex to pick up capitalized words in the middle of a sentence

Question

Asked 4 years, 7 months ago

Viewed 93 times

I need to pick up words that start with a capital letter in the middle of a sentence.

Example: "Facebook and Twitter are interesting. You can register anytime".

In the example above, I need to get Facebook and Twitter, but not "You".

My idea is to take every capitalized word that is not immediately preceded by a period followed by a space. I've tried a few things, but nothing works.

1 answer

by Lucas • **3,858** points · Answer 1 · 2020-12-02T02:16:38+00:00

The solution is to use a negative lookbehind, so your regex would look like this:

'(?<!\.\s)[A-Z][a-z]+'

Complete code:

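// Match a capitalized word unless it is immediately preceded by a period and one whitespace character.
const regexp = /(?<!\.\s)[A-Z][a-z]+/g
const text = "Facebook and Twitter are interesting. You can register anytime"

const result_array = text.match(regexp)
console.log(result_array) // ["Facebook", "Twitter"]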
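The pattern above only treats a period as a sentence ending and only matches unaccented ASCII letters. As a rough sketch (not part of the original answer), the same negative lookbehind can be widened to also skip words after "!" or "?" and to accept accented letters via Unicode property escapes. The helper name findCapitalizedWords is hypothetical, and the choice of `[.!?]` as sentence terminators is an assumption about what counts as a sentence ending.

```javascript
// Hypothetical helper: returns capitalized words that do not directly
// follow a sentence terminator (., ! or ?) plus one whitespace character.
function findCapitalizedWords(text) {
  // \p{Lu} = any uppercase letter, \p{Ll} = any lowercase letter.
  // The "u" flag is required for \p{...} escapes.
  const pattern = /(?<![.!?]\s)\p{Lu}\p{Ll}+/gu;
  // String.prototype.match returns null when nothing matches.
  return text.match(pattern) || [];
}

console.log(findCapitalizedWords("Facebook and Twitter are interesting. You can register anytime"));
console.log(findCapitalizedWords("Is \u00C1vila nice? Yes! Really."));
```

Like the original pattern, this still returns the first word of the text, because nothing precedes it, and it requires at least one lowercase letter after the capital, so a one-letter word such as "I" is skipped.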

What are lookbehinds

Lookbehinds are a specific kind of lookaround. Here is the definition from Jeffrey Friedl's book:

Lookarounds are similar to word-boundary metacharacters like \b or the anchors ^ and $ in that they do not match text, but rather positions within the text. However, lookaround is a much more general construct than the special-case word boundary and anchors.

One type of lookaround, called lookahead, looks ahead in the text (to the right) to see whether its subexpression can match, and succeeds as a regex component if it can. Positive lookahead is specified with the special sequence (?=...), as in (?=\d), which succeeds at positions where a digit comes next. Another type of lookaround is lookbehind, which looks back (to the left). It is given with the special sequence (?<=...), as in (?<=\d), which succeeds at positions with a digit to the left (i.e., at positions after a digit). (Free translation)

To add to that: (?<=...) is the positive lookbehind, whereas in your case we use (?<!...), which is the negative lookbehind. I recommend reading this section of the book for further clarification (see pages 59-67).
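To make the four lookaround forms concrete, here is a small illustrative sketch. The sample string and variable names are made up for this example. Each pattern matches a single lowercase letter and uses a lookaround to test for a neighboring digit without including that digit in the match.

```javascript
const sample = "a1bc2";

// Positive lookahead (?=\d): letters that are followed by a digit.
const positiveLookahead = sample.match(/[a-z](?=\d)/g);

// Negative lookahead (?!\d): letters that are not followed by a digit.
const negativeLookahead = sample.match(/[a-z](?!\d)/g);

// Positive lookbehind (?<=\d): letters that are preceded by a digit.
const positiveLookbehind = sample.match(/(?<=\d)[a-z]/g);

// Negative lookbehind (?<!\d): letters that are not preceded by a digit.
// The start of the string counts as "not preceded by a digit".
const negativeLookbehind = sample.match(/(?<!\d)[a-z]/g);

console.log(positiveLookahead);  // ["a", "c"]
console.log(negativeLookahead);  // ["b"]
console.log(positiveLookbehind); // ["b"]
console.log(negativeLookbehind); // ["a", "c"]
```

The digits never appear in the results: the lookaround only checks the position next to the letter, which is the "positions, not text" idea from the quote.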