Why does the Matcher class not return the number of groups correctly?


After the question "What is the difference in use between the Matcher() and find() methods?", I kept testing this class to understand how it works, but I came across some strange behavior.

When I try to get the number of groups found in the string by a given regular expression, the result is always 0, even when the regex does match in the string.

In the example below:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "um2tres4cinco6sete8";
String regex = "[0-9]";

Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(text);

while (m.find()) {
    System.out.println(m.group());
}
System.out.println("Total groups: " + m.groupCount());

The output is:

2
4
6
8
Total groups: 0

regex101 shows the same result.

According to the documentation for groupCount():

public int groupCount()
Returns the number of capturing groups in this matcher’s Pattern.

If this method is supposed to return the total number of captured groups, why does it return 0 and not 4 in this example? Or am I misinterpreting something about this method?

P.S.: If possible, I would like an explanation with examples.

1 answer



The result is zero because there is no capturing group in your regular expression. See the documentation for Pattern:

(X) X, as a capturing group

but "[0-9]" does not contain any part between '(' and ')'.
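To make the distinction concrete, here is a minimal sketch that compares "[0-9]" with "([0-9])" on the text from the question. The class name GroupCountDemo and the helper countMatches are made up for this illustration; only Pattern, Matcher, find() and groupCount() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {

    // Hypothetical helper: counts occurrences by calling find() until it fails.
    static int countMatches(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "um2tres4cinco6sete8";

        Pattern noGroup = Pattern.compile("[0-9]");    // no parentheses: no capturing group
        Pattern oneGroup = Pattern.compile("([0-9])"); // one pair of parentheses: one capturing group

        // groupCount() depends only on the pattern, so it can be read before any find().
        System.out.println(noGroup.matcher(text).groupCount());  // prints 0
        System.out.println(oneGroup.matcher(text).groupCount()); // prints 1

        // The number of occurrences is a separate quantity and has to be counted.
        System.out.println(countMatches(noGroup, text));  // prints 4
        System.out.println(countMatches(oneGroup, text)); // prints 4
    }
}
```

So the 4 the question expected is the number of matches, which Matcher does not expose through groupCount(); a loop over find() is one way to get it.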

Also note the documentation of group(), not to be confused with the method group(int):

Returns the input subsequence matched by the previous match.

i.e., returns what was found, not (necessarily) what corresponds to one of the groups.

With your example, if the expression were "([a-z]*)([0-9]*)", the results of the first find() would be:

  • group() = "um2" - the text that was matched
  • groupCount() = 2 - number of groups: 1: ([a-z]*), 2: ([0-9]*)
  • group(1) = "um"
  • group(2) = "2"
  • If not, how did find() find 4 occurrences?

  • I still don't understand the difference.

  • Now I get it. What a mess, Sun/Oracle: names like these only confuse people who are learning.
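For anyone who wants to check the first-match values listed in the answer, the sketch below runs "([a-z]*)([0-9]*)" against the text from the question. The class name FirstMatchDemo and the helper firstMatch are invented for this example, and the helper assumes the regex has at least two capturing groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstMatchDemo {

    // Hypothetical helper: returns { group(), group(1), group(2) } for the first match,
    // or null if nothing matches. Assumes the regex has at least two capturing groups.
    static String[] firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String regex = "([a-z]*)([0-9]*)";
        String text = "um2tres4cinco6sete8";

        String[] first = firstMatch(regex, text);
        System.out.println("group()  = " + first[0]); // um2 (the whole match)
        System.out.println("group(1) = " + first[1]); // um  (first pair of parentheses)
        System.out.println("group(2) = " + first[2]); // 2   (second pair of parentheses)

        // Two pairs of parentheses in the pattern, so two capturing groups.
        System.out.println("groupCount() = " + Pattern.compile(regex).matcher(text).groupCount()); // 2
    }
}
```

group() is equivalent to group(0), the whole match, and group 0 is not included in the value returned by groupCount().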
