How to find two overlapping matches with the same regex

In the following string:

83/80/95

I need to find 83/80 and 80/95 using the same regex.
I am using the following regex:

(\d{2,4})[\/](\d{2,4})

The first match, "83/80", is found, but the second is not.

1 answer

I don't believe this pattern can do it by processing the string in a single pass (that is, with a single call to the Match method), because the regex engine evaluates from left to right and always moves forward in the string. Once a match is found, the next one is sought from the position right after it. Here, the first match 83/80 consumes the 80, so the search resumes at /95, which does not match.
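To see this behavior directly, here is a minimal sketch (written with C# top-level statements, which is an assumption about the project setup) that runs the pattern from the question over the whole string with Matches. Only one match comes back, because the 80 has already been consumed by the time the engine resumes searching.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern exactly as written in the question.
var regex = new Regex(@"(\d{2,4})[\/](\d{2,4})");

// Matches scans left to right and resumes right after each match,
// so the "80" consumed by the first match cannot start a second one.
MatchCollection found = regex.Matches("83/80/95");

foreach (Match m in found)
{
    Console.WriteLine($"{m.Value} (index {m.Index}, length {m.Length})");
}
// prints: 83/80 (index 0, length 5)
```

The match ends at index 5, so what is left to search is "/95", which is why 80/95 is never reported.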

So the way to do it is to "cut" the original string, discarding only the part up to and including the slash, and run the regex again on this shortened string:

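using System;
using System.Text.RegularExpressions;

string s = "83/80/95";
var regex = new Regex(@"\d{2,4}/\d{2,4}");

// A match needs at least 5 characters: 2 digits, a slash, 2 digits.
while (s.Length >= 5)
{
    Match m = regex.Match(s);
    if (m.Success)
    {
        Console.WriteLine(m.Value);
        // Keep only what follows the slash of this match.
        s = s.Substring(s.IndexOf('/', m.Index) + 1);
    }
    else
    {
        break;
    }
}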

That is, in this loop I search for a match and print it. Then I use Substring to keep only the part of the string after the slash of that match (here, the first slash in the string), and search again in this substring.

In the first iteration, the search runs over the whole string and finds 83/80. Substring then returns everything after the first slash (that is, the string is now "80/95"), and in the next iteration the regex searches this string and finds 80/95.

When the regex finds no more matches (or when the remaining string is shorter than 5 characters), the loop stops. I used the condition while (s.Length >= 5) because the regex needs at least 5 characters to match: \d{2,4} means "at least 2 and at most 4 digits", so the shortest possible match is 2 digits, one slash, and 2 more digits. With fewer than 5 characters there is no need to try the regex, and the loop can stop right away.

The output of the code is:

83/80
80/95
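The same idea can be sketched without creating new strings, by passing a start position to the Regex.Match(string, int) overload instead of calling Substring. This is a variation on the loop above, not the code from the answer; FindPairs is a hypothetical helper name, and the snippet assumes C# top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// FindPairs is a hypothetical helper, introduced only for this sketch.
List<string> FindPairs(string input)
{
    var regex = new Regex(@"\d{2,4}/\d{2,4}");
    var pairs = new List<string>();
    int start = 0;

    while (start < input.Length)
    {
        // Search from 'start' instead of cutting the string.
        Match m = regex.Match(input, start);
        if (!m.Success)
        {
            break;
        }

        pairs.Add(m.Value);

        // Resume just after the slash inside this match, so the number
        // on its right can become the left side of the next pair.
        start = input.IndexOf('/', m.Index) + 1;
    }

    return pairs;
}

Console.WriteLine(string.Join(", ", FindPairs("83/80/95")));
// prints: 83/80, 80/95
```

Returning a list rather than printing inside the loop makes the result easier to reuse, but either form follows the same "advance past the slash and search again" logic.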

Also note that in the regex the slash does not need to be written as [\/]; you can simply write /. Brackets define a character class: for example, [ab] means "the letter a or the letter b" (either one), so [\/] means just "the character /", which you can replace with the / character itself. The brackets gain nothing here, so you can remove them.
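As a quick check of this point, the following sketch (assuming C# top-level statements) runs both spellings of the pattern against the same input and compares the results.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "83/80/95";

// Slash written as an escaped character inside a character class.
string withBrackets = Regex.Match(input, @"\d{2,4}[\/]\d{2,4}").Value;

// Slash written as a plain literal.
string plainSlash = Regex.Match(input, @"\d{2,4}/\d{2,4}").Value;

Console.WriteLine(withBrackets);               // prints: 83/80
Console.WriteLine(withBrackets == plainSlash); // prints: True
```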

I also removed the parentheses around each \d{2,4}. They form capture groups, and since you want the whole match (not just one group), they are not needed either.
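To illustrate the difference, this short sketch (again assuming C# top-level statements) compares the pattern with and without parentheses. The whole match is the same in both cases; the parentheses only add numbered groups for each side of the slash.

```csharp
using System;
using System.Text.RegularExpressions;

Match grouped = Regex.Match("83/80/95", @"(\d{2,4})/(\d{2,4})");
Match plain = Regex.Match("83/80/95", @"\d{2,4}/\d{2,4}");

Console.WriteLine(grouped.Value);           // prints: 83/80
Console.WriteLine(grouped.Groups[1].Value); // prints: 83
Console.WriteLine(grouped.Groups[2].Value); // prints: 80

Console.WriteLine(plain.Value);             // prints: 83/80
Console.WriteLine(plain.Groups.Count);      // prints: 1 (only group 0, the whole match)
```

If the two numbers are needed separately later on, keeping the parentheses is a reasonable choice.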
