Just make the part after the _ optional:
Regex r = new Regex(@"\{\{[^\}]+?\}\}(_\(\d+\))?");
I'm assuming the ID is numeric, so I used \d+ (one or more digits). The parentheses around the number must be escaped with \, and I wrapped all of this in another pair of parentheses to group it and make the whole stretch optional with ? (the question mark makes the entire group (_\(\d+\)) optional).
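To see the optional group in action, here is a small sketch. The sample strings and the helper name FirstMatch are made up for illustration, and the code uses top-level statements, so it assumes C# 9 or later:

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as above: a {{...}} block, optionally followed by _(digits).
var pattern = new Regex(@"\{\{[^\}]+?\}\}(_\(\d+\))?");

// Hypothetical helper: returns the matched text, or "" when nothing matches.
string FirstMatch(string input) => pattern.Match(input).Value;

Console.WriteLine(FirstMatch("{{name}}"));        // {{name}}
Console.WriteLine(FirstMatch("{{name}}_(123)"));  // {{name}}_(123)
Console.WriteLine(FirstMatch("{{name}}_(abc)"));  // {{name}}  (the suffix is not numeric, so it is left out)
```

Note that in the last case the regex still matches: since the suffix is optional, it simply matches the {{name}} part and ignores the rest.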
If the ID can contain letters and digits, an alternative is to replace \d with \w:
Regex r = new Regex(@"\{\{[^\}]+?\}\}(_\(\w+\))?");
However, the shortcut \w also matches the character _, so strings such as ___ and __1__ will be considered valid. If you do not want _, you can change it to:
Regex r = new Regex(@"\{\{[^\}]+?\}\}(_\([a-zA-Z\d]+\))?");
The character class [a-zA-Z\d] matches letters from a to z (upper and lower case), plus digits (\d). But this regex does not match accented letters; if you need those, you could use:
Regex r = new Regex(@"\{\{[^\}]+?\}\}(_\([\p{L}\d]+\))?");
The shortcut \p{L} matches every character that Unicode places in one of the "Letter" categories (all the categories whose code begins with "L"). That is, in addition to accented letters, it also matches letters from other alphabets (Arabic, Japanese, Cyrillic, etc).
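A quick way to compare the three character classes is to run them against the same inputs. This is only a sketch: the sample strings are made up, the ^ and $ anchors are my addition so that IsMatch tests the whole string, and it uses top-level statements (C# 9 or later):

```csharp
using System;
using System.Text.RegularExpressions;

// The ^ and $ anchors are an addition for this demo, so IsMatch tests the whole string.
var withWord    = new Regex(@"^\{\{[^\}]+\}\}(_\(\w+\))?$");
var withAscii   = new Regex(@"^\{\{[^\}]+\}\}(_\([a-zA-Z\d]+\))?$");
var withUnicode = new Regex(@"^\{\{[^\}]+\}\}(_\([\p{L}\d]+\))?$");

// Hypothetical inputs: a plain ID, an ID with underscores, and an accented ID ("ação").
string[] samples = { "{{x}}_(abc1)", "{{x}}_(__1__)", "{{x}}_(a\u00e7\u00e3o)" };

foreach (string s in samples)
{
    Console.WriteLine($"{s}: \\w={withWord.IsMatch(s)}, ascii={withAscii.IsMatch(s)}, unicode={withUnicode.IsMatch(s)}");
}

// Expected results:
//   abc1   -> accepted by all three
//   __1__  -> accepted only by \w
//   ação   -> accepted by \w and [\p{L}\d], rejected by [a-zA-Z\d]
```

In .NET, \w already covers Unicode letters by default, so the accented ID passes with \w as well; the difference from [\p{L}\d] is the underscore.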
Anyway, there are several options, and which one to use depends a lot on what your data looks like. If you know, for example, that there are no cases like __1__ and all IDs are valid, using only \w (or [a-zA-Z\d], if there are no accented letters) may be enough.
Another detail is that in the excerpt [^\}]+? you do not need the question mark. There it makes the quantifier + "lazy", but since you are matching [^\}] (anything that is not }) and right after it comes the } character itself, there is no risk of the regex going beyond what is necessary (which is one of the main reasons to use lazy quantifiers).
So this ? right after the + can be removed:
Regex r = new Regex(@"\{\{[^\}]+\}\}(_\(\d+\))?");
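If you want to check this for yourself, the sketch below runs the lazy and the greedy version over the same text and compares the matches. The input strings and the helper JoinMatches are made up for illustration, and the last two regexes (using a dot instead of [^\}]) are there only to show a case where laziness does make a difference. It uses top-level statements (C# 9 or later):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var lazy   = new Regex(@"\{\{[^\}]+?\}\}(_\(\d+\))?");
var greedy = new Regex(@"\{\{[^\}]+\}\}(_\(\d+\))?");

// Hypothetical helper: joins all matches found in the text with " | ".
string JoinMatches(Regex regex, string text)
{
    var values = new List<string>();
    foreach (Match m in regex.Matches(text))
    {
        values.Add(m.Value);
    }
    return string.Join(" | ", values);
}

// Hypothetical sample input.
string input = "a {{first}}_(1) b {{second}} c {{third}}_(33)";

Console.WriteLine(JoinMatches(lazy, input));    // {{first}}_(1) | {{second}} | {{third}}_(33)
Console.WriteLine(JoinMatches(greedy, input));  // {{first}}_(1) | {{second}} | {{third}}_(33)

// For contrast: with a dot (which also matches }), greedy and lazy differ.
var dotGreedy = new Regex(@"\{\{.+\}\}");
var dotLazy   = new Regex(@"\{\{.+?\}\}");
string two = "{{a}} and {{b}}";

Console.WriteLine(dotGreedy.Match(two).Value);  // {{a}} and {{b}}
Console.WriteLine(dotLazy.Match(two).Value);    // {{a}}
```

With [^\}] the quantifier can never consume a }, so both versions stop at the same place.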
I described it as always numeric, but it can also contain letters. Can you show me how it would look?
– LeoHenrique
@Leohenrique I updated the answer
– hkotsubo
Thanks for the full explanation :)
– LeoHenrique
I need some more help with another scenario. Besides the case in this question, I may also have this one:
(123,abc,456)
Can you tell me how you would handle it? (It can mix letters and numbers.) – LeoHenrique
@Leohenrique Maybe
@"\{\{[^\}]+\}\}(_\(\w+(,\w+)*\))?"
is already enough. Or
@"\{\{[^\}]+\}\}(_\((\d+|[a-z]+)(,(\d+|[a-z]+))*\))?"
I haven't tested it, but it should work. – hkotsubo
I used this:
@"\{\{[^\}]+\}\}(_\(\w+(,\w+)*\))?"
and it worked. Thank you. – LeoHenrique
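For reference, if the individual IDs from the comma-separated list are also needed, one possible sketch is below. The named group ids and the helper ExtractIds are my additions and not part of the regex discussed above; the sample inputs are made up, and the code uses top-level statements (C# 9 or later):

```csharp
using System;
using System.Text.RegularExpressions;

// Same regex as in the comment above, with a named group (ids) added
// around the comma-separated list so it can be read back afterwards.
var listPattern = new Regex(@"\{\{[^\}]+\}\}(_\((?<ids>\w+(,\w+)*)\))?");

// Hypothetical helper: returns the IDs of the first match, or an empty array
// when there is no match or the optional suffix is absent.
string[] ExtractIds(string text)
{
    Group ids = listPattern.Match(text).Groups["ids"];
    return ids.Success ? ids.Value.Split(',') : new string[0];
}

Console.WriteLine(string.Join(" ", ExtractIds("{{x}}_(123,abc,456)")));  // 123 abc 456
Console.WriteLine(ExtractIds("{{x}}").Length);                           // 0
```

Splitting the captured text on the comma keeps the regex itself unchanged apart from the added group name.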