How do I take only the values before the "=" character using a regular expression?


Viewed 4,221 times

13

I have a file containing the following contents:

Maringa=123123
Marialva=789789
Mandaguacu=123456
Cidadex=A341a2

How do I pick up only the characters before the =, using a regular expression?

I tried it this way: .*= But then the matched string includes the equals character (=) itself, and I would like to exclude that character.

  • 2

    What language are you using?

  • 3

    What programming language are you using? Different programming languages implement regexes with slightly different syntaxes.

  • I needed it for Delphi, and it worked. Thank you.

3 answers

15


You can capture regex groups after you match. Example:

(\w+)=(\w+)

So the first group will be your identifier and the second will be the value.
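As a rough illustration of the capture-group approach, here is a minimal JavaScript sketch. The variable names (line, identifier, value) are made up for this example, and it assumes both sides of the = consist only of \w characters, as in the sample file.

```javascript
// Hypothetical input line taken from the sample file in the question.
const line = "Maringa=123123";

// Group 1 captures the identifier, group 2 captures the value.
const match = line.match(/(\w+)=(\w+)/);

// match is null when the line does not have the key=value shape.
const identifier = match ? match[1] : null;
const value = match ? match[2] : null;

console.log(identifier); // Maringa
console.log(value);      // 123123
```

Other regex libraries, including the one in Delphi, expose groups through their own API, so only the pattern itself carries over.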

If you would rather have a regex that matches only the identifier, you can use:

\w+(?==)

(?=something) is called a "Positive Lookahead". It looks at the text that follows and checks whether it is something. If it is not, the match fails. If it is, the match succeeds, but the looked-ahead text is not included in the output. There are other variants, such as the "Negative Lookahead" (?!something), which does the opposite. Note that not every regular expression library supports this type of syntax.
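To make the difference concrete, the sketch below compares the pattern from the question (.*=) with the lookahead version in JavaScript. The names withEquals, lines, and identifiers are invented for the example; the input lines come from the sample file.

```javascript
// The original attempt consumes the "=" and therefore returns it.
const withEquals = "Maringa=123123".match(/.*=/)[0];
console.log(withEquals); // Maringa=

// The lookahead only checks that "=" follows; it is not part of the match.
const lines = ["Maringa=123123", "Cidadex=A341a2"];
const identifiers = lines.map((line) => {
  const m = line.match(/\w+(?==)/);
  return m ? m[0] : null; // null when no word is followed by "="
});
console.log(identifiers); // [ 'Maringa', 'Cidadex' ]
```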

  • 1

    Thank you very much, the second option worked. However, I did not really understand what was done. Could you give a brief explanation?

  • @fymoribe Lookahead is an expression that must be matched but is not considered part of the matched pattern, for extraction/substitution purposes.

  • @fymoribe I added a little more detail.

  • @Guilhermebernal thank you very much for the explanation, the implementation worked.

  • @Guilhermebernal In this case, is the equals sign a token for anything? Wouldn’t it be right to use \w here?

  • @Victor, I don’t understand the question. In this case, \w+(?==) searches for a word \w+ followed by an equals sign (?==). If it were followed by an arrow, for example, it would be (?=->).

  • @Guilhermebernal I understand now! I was reading the ?= as an optional =. I did not know lookahead existed and had not understood it.


13

Use ^[^=]*, which matches an uninterrupted sequence of characters other than = at the beginning of the line. Parts:

  • ^: matches the start of the line;
  • [^=]: matches a single character other than = (at the beginning of a bracket expression, the caret, ^, means negation);
  • *: repeats the preceding expression ([^=]) as many times as possible.

Without the first part (^), the regular expression produces two results: the text before the = and the text after the =. If you are running the regular expression on each line and picking only the first result, the leading ^ is optional and the expression can be simplified to [^=]*.
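A short JavaScript sketch of both behaviours described above, applied to the sample file. The names contents, keys, and both are assumptions for the example. The unanchored global search uses + instead of * purely for display, because in JavaScript [^=]* with the g flag also reports empty matches.

```javascript
// File contents from the question, joined with newlines (no trailing newline).
const contents = "Maringa=123123\nMarialva=789789\nMandaguacu=123456\nCidadex=A341a2";

// Anchored pattern applied to each line: only the text before "=" is returned.
const keys = contents.split("\n").map((line) => line.match(/^[^=]*/)[0]);
console.log(keys); // [ 'Maringa', 'Marialva', 'Mandaguacu', 'Cidadex' ]

// Without the anchor, a global search on one line finds both sides of the "=".
const both = "Maringa=123123".match(/[^=]+/g);
console.log(both); // [ 'Maringa', '123123' ]
```

Because the pattern uses only a negated character class, it should carry over to engines that lack \w or lookahead.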

  • I had answered [^=]*, but I deleted my answer because I think the question is actually a different one: "But then the matched string includes the equals character (=) itself, and I would like to exclude that character"

  • Thank you very much for the answer. Next time I will try to write my question more clearly.

  • 1

    @fymoribe If you think you can make this question clearer, please edit (there is an "edit" link below the question). This question can be useful for many people in the future

  • @bfavaretto, in fact [^=]* does not include the equals character (=) in the match.

  • 1

    I will abstain; regex is not my strong suit.

  • 2

    This answer deserves more votes, not because the other answer is wrong, but because this answer works in most regex variations and teaches a general pattern that you can use for similar problems (not all variations have \w...).


2

Depending on what you want to do, you might want to use split to break the string in two.

JavaScript:

// Split the line at the "=" separator.
var linha     = "Maringa=123123";
var resultado = linha.split("=");

resultado is now an array containing ['Maringa', '123123'].
