Regex to find occurrences of one word before the other

7

I need to find places in the system's source code where TAction occurs after TdxBar.

They will be on different lines, but within the same file, for example:

pnlVisao: TPanel;
tbToolBar: TdxBar;
btnCancelar: TButton;
actAbrir: TAction;

In this case the action is declared after the toolbar.

I tried to use /(TdxBar)(?=TAction)/g but it didn’t work.

  • In what language will you use it?

  • It would run in Sublime or in Atom. It is for a search, in order to make a correction.

3 answers

11


From what I found, there is a flag you can use in Sublime that lets the regex match across lines.

Try with the Regex:

(?s)TdxBar.*?TAction

The (?s) tells Sublime that the . should also match line breaks, so the match can span multiple lines.

Source: https://stackoverflow.com/questions/11992596/regex-in-sublime-text-match-any-character-including-newlines
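To check the effect of the flag outside the editor, here is a minimal sketch using Python's re module, which also accepts the inline (?s) flag. Python is only a stand-in here, not Sublime's engine; the sample text is the one from the question and the variable names are illustrative.

```python
import re

# Sample taken from the question: TAction is declared a few lines after TdxBar.
source = """pnlVisao: TPanel;
tbToolBar: TdxBar;
btnCancelar: TButton;
actAbrir: TAction;
"""

# Without (?s) the dot stops at line breaks, so nothing is found.
without_flag = re.search(r"TdxBar.*?TAction", source)

# With (?s) the dot also matches line breaks, so the match spans several lines.
with_flag = re.search(r"(?s)TdxBar.*?TAction", source)

print(without_flag)           # None
print(with_flag is not None)  # True
```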


Edit (2018-09-19)

I came across some explanations online recently and decided to add them to this answer.

There is a post in the Sublime forum asking about a new regex engine implemented in the software. The creator of Package Control replies that the new engine is used for syntax highlighting, and reading the post one learns that the Oniguruma library is used in Sublime Text.

Looking through the repository, I found where the (?s) flag is documented:

+ ONIG_SYNTAX_PERL and ONIG_SYNTAX_JAVA
    (?s): dot (.) also matches newline
    (?m): ^ matches after newline, $ matches before newline
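The two flags are easy to confuse, so here is a small sketch of the difference. It assumes Python's re module, which uses the same Perl-style meanings for (?s) and (?m); it is not Oniguruma itself, and the sample text is made up.

```python
import re

text = "first line\nsecond line"

# (?s): the dot also matches the newline, so one match covers both lines.
dotall = re.findall(r"(?s)first.*line", text)

# (?m): ^ and $ anchor at each line, but the dot still stops at the newline.
multiline = re.findall(r"(?m)^\w+ line$", text)

print(dotall)     # ['first line\nsecond line']
print(multiline)  # ['first line', 'second line']
```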
  • Perfect, thank you

  • cool! + 1 ......

  • I didn't know Sublime could handle (?s); I used to use the regex I suggested below :-)

  • 1

    I was also surprised when I found out. I had never needed it; I found it while researching this question. :D

3

If the intention is to list the files that contain that pattern, I suggest:

grep  -zPl 'TdxBar(.|\n)*TAction' *
  • -z (null-separated records) makes grep treat the whole file as if it were a single line, so the regular expression can span multiple lines
  • -l prints only the names of the files that contain matches of the pattern
  • -P uses extended regular expressions in Perl syntax
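Where grep is not available, the same idea can be sketched in Python. This is only an approximation of the command above under a few assumptions: the function name files_with_pattern is hypothetical, files are read as UTF-8 with undecodable bytes ignored, and only the given directory is scanned (no recursion).

```python
import re
from pathlib import Path

# Same idea as the grep pattern, written with (?s) so the dot crosses line breaks.
PATTERN = re.compile(r"(?s)TdxBar.*TAction")


def files_with_pattern(directory):
    """Return the sorted names of files in `directory` where TdxBar precedes TAction."""
    matches = []
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        # Read the whole file at once so the match can span lines;
        # errors="ignore" is an assumption to tolerate odd encodings.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if PATTERN.search(text):
            matches.append(path.name)
    return sorted(matches)
```

Calling files_with_pattern(".") would return the matching file names in the current directory, similar to what grep -l prints.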

1

Complementing fernandosavio's answer, another alternative is to use the regex:

TdxBar[\s\S]+TAction

Instead of the dot, I use [\s\S].

The brackets define a character class and match any character within them. For example, [ab] means "the letter a or the letter b".

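In this case, inside the brackets we have:

  • \s: whitespace characters (spaces, tabs, line breaks, etc.)
  • \S: anything that is not matched by \s

Therefore, [\s\S] means "anything that is whitespace (spaces, line breaks, etc.) or anything that is not". In short, it ends up matching any character, including line breaks.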

In the end, it has the same effect as using . together with the (?s) flag (by default the dot matches any character except line breaks, and the (?s) flag changes this behavior so that it also matches line breaks).
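The equivalence can be checked with a short sketch in Python's re module (used here only as a stand-in; the sample text is adapted from the question):

```python
import re

source = "tbToolBar: TdxBar;\nbtnCancelar: TButton;\nactAbrir: TAction;\n"

# [\s\S] matches whitespace or non-whitespace, i.e. any character at all.
with_class = re.search(r"TdxBar[\s\S]+TAction", source)

# The dot plus (?s) covers the same characters, line breaks included.
with_flag = re.search(r"(?s)TdxBar.+TAction", source)

print(with_class.group(0) == with_flag.group(0))  # True
```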

I also use the quantifier +, meaning "one or more occurrences". This requires at least one character between "TdxBar" and "TAction".

If you use * (zero or more occurrences), the regex will also accept cases where there is nothing between them (i.e., if the text contains "TdxBarTAction", the regex with * matches it but the one with + does not, because + requires at least one character between them).
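A small sketch of the difference between + and *, assuming Python's re module; the two test strings are made up for illustration:

```python
import re

adjacent = "TdxBarTAction"      # nothing between the two names
separated = "TdxBar;\nTAction"  # at least one character in between

plus = re.compile(r"TdxBar[\s\S]+TAction")
star = re.compile(r"TdxBar[\s\S]*TAction")

print(bool(plus.search(adjacent)))   # False
print(bool(star.search(adjacent)))   # True
print(bool(plus.search(separated)))  # True
print(bool(star.search(separated)))  # True
```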


By default, quantifiers such as * and + are greedy and try to consume as many characters as possible. This means that if there are two occurrences of "TdxBar" and "TAction", the regex will take the whole stretch between the first "TdxBar" and the last "TAction".

If you know that they occur at most once in the text, and you just want to know whether one comes before the other, there is no problem. But if you want the regex to match each stretch separately, just turn off the greediness by adding a ? after the +:

TdxBar[\s\S]+?TAction

Thus, the quantifier becomes lazy and tries to consume as few characters as possible. The difference shows when the text has two stretches: the lazy quantifier finds 2 matches, whereas the greedy version finds only one.
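The greedy and lazy behaviors can be compared with a sketch like the following, assuming Python's re module. The component names (tbMain, actOpen, and so on) are hypothetical and only serve to create two separate stretches.

```python
import re

# Two separate TdxBar ... TAction stretches in the same text.
source = (
    "tbMain: TdxBar;\n"
    "actOpen: TAction;\n"
    "tbExtra: TdxBar;\n"
    "actClose: TAction;\n"
)

greedy = re.findall(r"TdxBar[\s\S]+TAction", source)
lazy = re.findall(r"TdxBar[\s\S]+?TAction", source)

print(len(greedy))  # 1: from the first TdxBar to the last TAction
print(len(lazy))    # 2: one match per TdxBar ... TAction stretch
```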
