To complement the other answer: it does not take into account an important detail of the question, which is that the } characters should be at the end of the line. That is, if we have a text like this:
abc } def
blablabla
xyz }
The regex should not consider the } on the first line, because it is not at the end of the line. But the solution proposed in the other answer matches this case as well.
For the regex to consider only the } at the end of a line, we can use the anchor $. By default it means "end of string", but many languages and engines have a flag that makes it also match at the end of each line. On regex101.com, for example, just enable the multiline flag (option m in the upper right corner), so the regex becomes }$(.|\n)*?}$ (see the difference).
This flag is present in most languages, usually under the name multiline: for example, it exists in PHP, Python, Java, JavaScript, etc. Check the documentation of the language or tool you are using; most have this option.
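As a minimal sketch of the effect of this flag, assuming Python's built-in re module (the sample text below is made up for illustration and is not from the question):

```python
import re

# Hypothetical sample text: only the "}" on lines 2 and 4 end a line.
text = "abc } def\nfoo }\nblablabla\nxyz }\nend"

# Without MULTILINE, "$" only matches at the end of the whole string.
no_flag = re.findall(r"}$", text)

# With MULTILINE, "$" also matches right before each line break.
with_flag = re.findall(r"}$", text, flags=re.MULTILINE)

# The regex from this answer: from one line-ending "}" to the next one.
match = re.search(r"}$(.|\n)*?}$", text, flags=re.MULTILINE)

print(no_flag)
print(with_flag)
print(repr(match.group(0)))
```

Note that the } in "abc } def" is never used as a starting point, because it is not followed by the end of a line.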
Another detail is that the dot, by default, means "any character except line breaks". To make it also match line breaks, you can use the option proposed in the other answer, (.|\n), but in engines where the dot also excludes \r (JavaScript and Java, for example) this option fails on Windows line breaks (\r\n). A better alternative would be to enable the DOTALL flag (also called singleline, a somewhat confusing name given that its function is to make the dot match line breaks).
On regex101.com it is called singleline, and with it enabled I can change the regex to }$.*?}$ (see). In PHP the option is s (but its name is PCRE_DOTALL), ditto in Python and JavaScript, and in Java it is only DOTALL (although Java also accepts the syntax (?s) within the expression, which enables this flag too).
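A small sketch of what DOTALL changes, again assuming Python's re module and a made-up sample text:

```python
import re

# Hypothetical sample text with two "}" at the end of a line.
text = "foo }\nblablabla\nxyz }\nend"

# MULTILINE alone: "." stops at line breaks, so the two "}" cannot be joined.
multiline_only = re.search(r"}$.*?}$", text, flags=re.MULTILINE)

# MULTILINE + DOTALL: "." now also matches "\n".
both = re.search(r"}$.*?}$", text, flags=re.MULTILINE | re.DOTALL)

print(multiline_only)
print(repr(both.group(0)))
```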
Yes, in many languages flags can be enabled within the regex itself. Check the documentation, but in this case the most common syntax is (?s)(?m)}$.*?}$ (the (?s) enables DOTALL mode and the (?m) enables MULTILINE mode; see here that it works the same way).
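To illustrate that both forms are equivalent, here is a short sketch assuming Python's re module, which accepts these inline flags at the start of the expression (the sample text is hypothetical):

```python
import re

# Hypothetical sample text with two "}" at the end of a line.
text = "foo }\nblablabla\nxyz }\nend"

# Flags written inside the expression itself.
inline = re.search(r"(?s)(?m)}$.*?}$", text)

# The same flags passed as an argument.
by_argument = re.search(r"}$.*?}$", text, flags=re.DOTALL | re.MULTILINE)

print(inline.group(0) == by_argument.group(0))
```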
Another alternative is to use }$[\s\S]*?}$. The shortcut \s matches whitespace, including line breaks, and \S is "anything that is not \s". So [\s\S] is everything that is \s plus everything that is not \s; that is, a trick to match any character, including line breaks. This way you do not need the DOTALL flag (see); it is commonly used in engines that do not have this option.
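A sketch of the [\s\S] trick, assuming Python's re module and a hypothetical sample text. Only MULTILINE is enabled, so the dot would not cross line breaks here, but [\s\S] does:

```python
import re

# Hypothetical sample text with two "}" at the end of a line.
text = "foo }\nblablabla\nxyz }\nend"

# No DOTALL: [\s\S] matches any character, line breaks included.
match = re.search(r"}$[\s\S]*?}$", text, flags=re.MULTILINE)

print(repr(match.group(0)))
```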
How many lines between the }?
The regex above takes as many lines as necessary until it finds the second } at the end of a line. But from the description, I understand that you really want there to be just one line between the two }. In this case, the regex could be:
}(\r\n?|\n).*(\r\n?|\n).*}(\r\n?|\n|$)
Now I use (\r\n?|\n) to consider Windows line breaks (\r\n), classic Mac OS line breaks (only the \r, since the \n? indicates that the \n is optional), or a single \n, which is the line break on Unix/Linux.
One detail is that this way I don’t need the MULTILINE and DOTALL flags. The regex now takes one } followed by a line break, then matches .* (zero or more characters) followed by another line break, followed by .* (zero or more characters), followed by }, followed by another line break.
This way I guarantee that there will be only one line in between. Note that in this case the DOTALL flag must be switched off, to prevent .* from spanning more than one line. Also note that at the end we have |$ because, besides a line break, we can have the end of the string (for cases where the } is the last character and there is no line after it). See here the regex working.
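As a quick check of this regex, here is a sketch assuming Python's re module; the three sample strings and the name ONE_LINE_BETWEEN are made up for illustration:

```python
import re

# Regex from this answer; no flags are passed, so "." does not match "\n".
ONE_LINE_BETWEEN = re.compile(r"}(\r\n?|\n).*(\r\n?|\n).*}(\r\n?|\n|$)")

# Hypothetical inputs: one line in between, two lines in between,
# and one line in between with Windows line breaks.
one_line = "foo }\nmiddle\nbar }"
two_lines = "foo }\nfirst\nsecond\nbar }"
windows = "foo }\r\nmiddle\r\nbar }\r\n"

m1 = ONE_LINE_BETWEEN.search(one_line)
m2 = ONE_LINE_BETWEEN.search(two_lines)
m3 = ONE_LINE_BETWEEN.search(windows)

print(repr(m1.group(0)))
print(m2)
print(repr(m3.group(0)))
```

The second input is rejected because there are two lines between the } characters, and the first input matches thanks to the |$ alternative at the end.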
It can be kind of "tedious" to repeat the same line break expression several times. For this, there is the subroutine feature, which some engines support (refer to your language or tool documentation to see whether you can use it):
}(\r\n?|\n).*(?1).*}((?1)|$)
In this case, the parentheses form a capture group, and we can refer to it later with (?1) (basically, that means "use here the same expression that was used in the first capture group"). See here the regex working.
Some engines also support named groups, which can help make the regex a little more "readable" (or not, it’s a matter of opinion):
}(?<linebreak>\r\n?|\n).*(?&linebreak).*}((?&linebreak)|$)
(?<linebreak> defines the group called linebreak and (?&linebreak) means "use here the same expression as the group linebreak". See it working here.
Finally, some regex engines support the shortcut \R, which matches a line break (\r\n as well as \n or \r alone, among other characters; this varies according to the language). So you could also use something like }\R.*\R.*}(\R|$) (see here), or }(\R).*(?1).*}((?1)|$) (see here).
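In engines without subroutines or \R (Python's built-in re module is one of them), a similar effect can be approximated by building the pattern from a reusable string. This is only a sketch of that workaround, not the subroutine feature itself, and the name LINE_BREAK is hypothetical:

```python
import re

# Hypothetical helper: one line break in Windows, classic Mac OS or Unix style.
LINE_BREAK = r"(?:\r\n?|\n)"

# Same shape as }(\r\n?|\n).*(?1).*}((?1)|$), with the fragment pasted in by Python.
pattern = re.compile(
    "}" + LINE_BREAK + ".*" + LINE_BREAK + ".*}" + "(?:" + LINE_BREAK + "|$)"
)

match = pattern.search("foo }\nmiddle\nbar }")
print(repr(match.group(0)))
```

The line break expression is written only once, which is the same goal that (?1) and (?&linebreak) achieve inside the regex.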
Can you give an example of the text and clarify which programming language you are using?
– Sergio
It is necessary to specify where the regex will be used; depending on the environment, the arguments change.
– Paz