How to select text that does not have a certain term in the middle?

Asked 7 years, 2 months ago

I am trying to select part of an HTML document with a regex, but I can't get the expression right. Could someone help?

I need to select each group of <li> elements separately, that is, each run of <li> elements with no <br> tag in the middle.

For example, I'm trying the expression below:

/<li.*(?!<br).*\/li>/gi

And I need to select the following text separately:

<li>Teste 1</li><li>Teste 2</li><li>Teste 3</li>

In my test I created two occurrences of that list, but the expression selects everything from the first occurrence to the last.

How do I select the two lists separately?

1 answer

by hkotsubo • **55,826** points · Answer 1 · 2018-05-31T21:02:10+00:00

The problem is that the quantifiers * and + are "greedy": they try to match as many characters as possible while still satisfying the expression.

To cancel this greedy behavior, just put a ? after the *. The expression will then match as few characters as necessary (which is why *? is also called a lazy quantifier). The regex would then look like this:

/<li.*?(?!<br).*?\/li>/

You can see it running here.
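
The difference between the greedy and lazy versions can be checked with a small script. This is only a sketch: the sample string `html` is an assumption (two lists of three items separated by a `<br>`), since the exact test input from the question is not shown, and the `g` and `i` flags are taken from the question's original expression.

```javascript
// Hypothetical sample input: two lists of <li> items separated by a <br> tag.
const html =
  '<li>Teste 1</li><li>Teste 2</li><li>Teste 3</li><br>' +
  '<li>Teste 4</li><li>Teste 5</li><li>Teste 6</li>';

// Greedy version from the question: .* grabs as much as it can.
const greedy = /<li.*(?!<br).*\/li>/gi;

// Lazy version from the answer: .*? grabs as little as it can.
const lazy = /<li.*?(?!<br).*?\/li>/gi;

const greedyMatches = html.match(greedy);
const lazyMatches = html.match(lazy);

console.log(greedyMatches.length); // 1 (the whole string, from the first <li to the last /li>)
console.log(lazyMatches.length);   // 6 (one match per <li>...</li>)
console.log(lazyMatches[0]);       // <li>Teste 1</li>
```

Note that `.` does not match line breaks by default in JavaScript, so the greedy match would stop at the end of a line if the input spanned several lines.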

With the g flag, the regex above finds 6 separate matches (one for each li tag). To match a sequence of several li elements that does not contain br as a single unit, just search for 1 or more occurrences of the whole previous regex (using the + quantifier):

(<li.*?(?!<br).*?\/li>)+

You can see this regex working here.
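
As a rough check of the grouped version, the sketch below runs it against a hypothetical sample string (`sample`, two lists separated by a `<br>`; the real test input is not shown in the question) and collects each list as one match. The `g` and `i` flags are an assumption carried over from the question's expression.

```javascript
// Hypothetical sample input: two lists of <li> items separated by a <br> tag.
const sample =
  '<li>Teste 1</li><li>Teste 2</li><li>Teste 3</li><br>' +
  '<li>Teste 4</li><li>Teste 5</li><li>Teste 6</li>';

// One or more consecutive <li>...</li> items, treated as a single match.
const listRegex = /(<li.*?(?!<br).*?\/li>)+/gi;

// match() with the g flag returns the full text of every match.
const lists = sample.match(listRegex);

console.log(lists.length); // 2
console.log(lists[0]);     // <li>Teste 1</li><li>Teste 2</li><li>Teste 3</li>
console.log(lists[1]);     // <li>Teste 4</li><li>Teste 5</li><li>Teste 6</li>
```

In this sample, each repetition of the group has to start with `<li` right after the previous `/li>`, so the run ends as soon as a `<br>` appears. The same applies to any other text between items: if the real HTML has spaces or line breaks between the `<li>` elements, the items would no longer be joined into one match and the expression would need adjusting.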