Regular expressions do not match the desired text snippet

I need my program to capture one item from a given text, but instead of stopping there it also captures everything that comes after it.

The code I’m using, for example:

String html = "ItemPago12.569,00DeducoesPagas36.567,52ItensQnt6DeducoesRetidas21.354,11";
Pattern conteudo = Pattern.compile("ItemPago([^<]+)Deducoes");
Matcher match = conteudo.matcher(html);
match.find();

System.out.println(match.group(1));

Running program: https://ideone.com/JwFxu2

I need to get what’s between ItemPago and Deducoes. I would like examples and explanations of how to use this correctly. Thank you.

  • Your code works as expected here: https://ideone.com/aPX7FP

  • @Articuno It works there because it is a simple example, but if I change the input string and add one more occurrence of the word "Deducoes", for example, it captures everything: https://ideone.com/JwFxu2

  • It would then be worth editing the question to show the string that produces the error, because the one there now causes no problems.

  • @Articuno I just edited it, so the problem is more evident now...

1 answer

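Quantifiers in regular expressions have three possible behaviors: greedy, reluctant and possessive. Your [^<]+ is greedy, so it consumes as much text as it can and only gives characters back until the last Deducoes in the string matches. What you want is the reluctant behavior. You can use .*?, where .* means any sequence of characters and the trailing ? makes the quantifier reluctant.

Reluctant behavior tells the regex engine to consume as few characters as possible and to stop at the first point where the rest of the pattern can match, instead of looking for a longer match.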
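To make the difference concrete, here is a small sketch that runs the three behaviors against the string from the question. The class QuantifierDemo and the helper firstGroup are names made up for this illustration; the helper returns null when nothing matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: group 1 of the first match, or null when there is no match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "ItemPago12.569,00DeducoesPagas36.567,52ItensQnt6DeducoesRetidas21.354,11";

        // Greedy: takes everything, then backtracks to the LAST "Deducoes".
        System.out.println(firstGroup("ItemPago(.*)Deducoes", html));
        // prints 12.569,00DeducoesPagas36.567,52ItensQnt6

        // Reluctant: grows one character at a time and stops at the FIRST "Deducoes".
        System.out.println(firstGroup("ItemPago(.*?)Deducoes", html));
        // prints 12.569,00

        // The character class from the question, made reluctant with +?.
        System.out.println(firstGroup("ItemPago([^<]+?)Deducoes", html));
        // prints 12.569,00

        // Possessive: takes everything and never gives anything back, so "Deducoes" cannot match.
        System.out.println(firstGroup("ItemPago(.*+)Deducoes", html));
        // prints null
    }
}
```

The possessive form is rarely what you want for "text between two markers", but it shows why the greedy version depends on backtracking to find any match at all.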

Here is the complete code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Ideone {
    public static void main(String[] args) {
        String html = "ItemPago12.569,00DeducoesPagas36.567,52ItensQnt6DeducoesRetidas21.354,11";
        // .*? is reluctant: it stops at the first "Deducoes" instead of the last one.
        Pattern conteudo = Pattern.compile("ItemPago(.*?)Deducoes");
        Matcher match = conteudo.matcher(html);

        // group(1) is only valid after a successful find().
        if (match.find()) {
            System.out.println(match.group(1));
        }
    }
}

Here’s the output:

12.569,00

See it working on ideone.
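If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: BetweenDemo and between are hypothetical names, and using Pattern.quote for the delimiters and Pattern.DOTALL for multi-line text are my assumptions about what is usually wanted, not something the question requires.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenDemo {
    // Hypothetical helper: text between the first `start` and the nearest `end` after it.
    static Optional<String> between(String input, String start, String end) {
        // Pattern.quote makes the delimiters literal; DOTALL lets . match line breaks too.
        Pattern p = Pattern.compile(
                Pattern.quote(start) + "(.*?)" + Pattern.quote(end), Pattern.DOTALL);
        Matcher m = p.matcher(input);
        // Only read group(1) when find() succeeded; otherwise report "no match".
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "ItemPago12.569,00DeducoesPagas36.567,52ItensQnt6DeducoesRetidas21.354,11";

        System.out.println(between(html, "ItemPago", "Deducoes").orElse("(no match)"));
        // prints 12.569,00
        System.out.println(between(html, "DeducoesPagas", "ItensQnt").orElse("(no match)"));
        // prints 36.567,52
        System.out.println(between(html, "Total", "Deducoes").orElse("(no match)"));
        // prints (no match)
    }
}
```

Returning an Optional avoids the IllegalStateException that match.group(1) throws when find() did not succeed.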

  • Excellent answer. His problem is that he is matching greedily; he was just missing a ? after the +. +1

  • It worked perfectly. Could you point me to a guide for using these features more wisely?

  • @netoschneider In my case, it comes from a few years of accumulated experience. But you can start by reading and studying the details here and here.

  • @Victorstafusa Thank you!!
