Regex to extract the text between <> in Python?


Viewed 1,898 times

2

I would like to extract the text contained between <> of a string. At first, I built the following expression:

import re
m=re.search(r'<(.*)>','abevbv envvrhwkv <eiwbv> ebvi <wieunv> ajhbsvhj')

The expression would work if the string contained only one pair of <>. With two pairs, as here, the captured group comes back as:

'eiwbv> ebvi <wieunv'

But I want:

'eiwbv'

What regular expression would I have to use to get this result?

2 answers

5

Place a question mark after the asterisk, like this:

m=re.search(r'<(.*?)>','abevbv envvrhwkv <eiwbv> ebvi <wieunv> ajhbsvhj')

print(m)
<_sre.SRE_Match object; span=(17, 24), match='<eiwbv>'>
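The match object printed above covers the whole `<eiwbv>`, brackets included. As a minimal sketch, assuming the same sample string as the question, the inner text can be read from capture group 1. The names `text` and `first` are only illustrative:

```python
import re

text = 'abevbv envvrhwkv <eiwbv> ebvi <wieunv> ajhbsvhj'

# Lazy quantifier: stop at the first '>' after the opening '<'
m = re.search(r'<(.*?)>', text)

# re.search returns None when nothing matches, so guard before calling group()
first = m.group(1) if m else None
print(first)
```

`group(0)` would give the full match with the brackets, while `group(1)` gives only what the parentheses captured.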


2


Your regex is almost correct. What went wrong is that you used a greedy quantifier on the dot (.).

A greedy quantifier makes the regex keep going until the last position where it can still match, so the captured group is always as large as possible. You should have used a lazy quantifier, which makes the match stop at the first occurrence of the delimiter, in your case ">".

To solve your problem you only need to change the * to *?

import re
m=re.search(r'<(.*?)>','abevbv envvrhwkv <eiwbv> ebvi <wieunv> ajhbsvhj')
print(m.group(1))  # text captured inside the first <>
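To make the greedy versus lazy difference concrete, here is a small side-by-side sketch on the question's string. The `re.findall` call and the negated character class `[^>]*` are not part of the answers above; they are shown as possible alternatives, under the assumption that the text inside the brackets never contains a ">" itself:

```python
import re

text = 'abevbv envvrhwkv <eiwbv> ebvi <wieunv> ajhbsvhj'

# Greedy: .* runs to the LAST '>' in the string
greedy = re.search(r'<(.*)>', text).group(1)

# Lazy: .*? stops at the FIRST '>' after the opening '<'
lazy = re.search(r'<(.*?)>', text).group(1)

# findall returns the captured group for every non-overlapping match
all_lazy = re.findall(r'<(.*?)>', text)

# Alternative without a lazy quantifier: match anything except '>'
all_negated = re.findall(r'<([^>]*)>', text)

print(greedy)
print(lazy)
print(all_lazy)
print(all_negated)
```

On this input the lazy pattern and the negated class should give the same result. One difference worth knowing is that `.` does not match a newline by default, whereas `[^>]` does.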
