What does the expression "/^(?:(?:gatos?|cachorro):)?/" do?

7

My question is about the following regular expression:

/^(?:(?:gatos?|cachorro):)?/

Here is what I understand so far:

  • ^: beginning of the string.

  • ( ): This is a grouping, right? Like (gato|cachorro|etc.)?

  • Why the (? at the start? Does it indicate something like "if there is a ("?

  • At the end, there is a ( ) block followed by ?. Does that mean something like "if this block exists"?

  • This regex seems strange to me. (?: means that it matches the pattern but does not capture, or is an empty group. There is a stray : here => ):)?. The ? at the end means the preceding character is optional, i.e. the group can match gato or gatos. There’s one at the end of the group ...

  • I found this: http://regexr.com/ and tested with "/(?: (?: cats?| dog):)? /", typing several inputs in the text field (cats, cat, cat:, cats:, other data, dog, etc.). I removed the ^ from the beginning of the pattern. It can also be tested with grep in the terminal, but you have to remove the '/ /' and write '(' as '\(' and '|' as '\|'. Very grateful, I have to study further...

2 answers

14

Let’s explain the regex:

  • /.../ - The / delimiters are used in JavaScript to denote that what is inside them is a regex. Therefore, the actual regex is ^(?:(?:gatos?|cachorro):)?.

  • ^ - Start of string. Whatever is matched must be found at the beginning of the string, not in the middle of it.

  • (?: ... ) - This is a non-capturing group. It is only used to group subexpressions. Regex allows captures with ( ... ), so that you can extract the parts of the text that produced the match. The ?: disables the capture when you have no interest in it, or when it would interfere with other groups that you do want captured. In your case, you have no interest in capturing parts, only the whole.

  • gatos?|cachorro - This matches gato, gatos, or cachorro. The s? means that the s may or may not appear. The | separates alternatives. The surrounding (?: ... ) groups everything so that the | knows where the alternatives begin and end.

  • : (the last one) - The literal character : itself.

  • The last ? - Means that whatever is inside the preceding (?: ... ) may or may not appear. If it does not appear, the regex still matches.
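To make the difference between ( ... ) and (?: ... ) concrete, here is a minimal sketch that runs the same pattern with and without captures. The input "gatos:miau" and the variable names are only illustrative assumptions, not part of the original question.

```javascript
// Same pattern twice: once with capturing groups, once with non-capturing ones.
const capturing = /^((gatos?|cachorro):)?/;
const nonCapturing = /^(?:(?:gatos?|cachorro):)?/;

// "gatos:miau" is an arbitrary sample input chosen for illustration.
const withGroups = "gatos:miau".match(capturing);
const withoutGroups = "gatos:miau".match(nonCapturing);

console.log(withGroups[0]);        // the whole match
console.log(withGroups[1]);        // outer group: the word plus the colon
console.log(withGroups[2]);        // inner group: the word only
console.log(withGroups.length);    // whole match plus two captured groups
console.log(withoutGroups[0]);     // the same whole match
console.log(withoutGroups.length); // only the whole match, nothing captured
```

Both versions match the same text; the only difference is whether the pieces are stored for later use.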

Thus, there are only four strings that this regex can match, each of them anchored at the beginning of the string. They are:

  • (the empty string)

  • gato:

  • gatos:

  • cachorro:

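Note that the end of the string is not checked. Thus, gato:blabla also gives a match (the matched text is gato:). And because the start is anchored, the gato: inside xgato: is not matched; there the regex matches only the empty string at the beginning.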
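A quick way to check this is to print what the regex actually matched for a few inputs. This is a minimal sketch; the helper name matchedPrefix and the sample inputs are assumptions made for illustration. Since the whole group is optional, the regex can always fall back to the empty string, so the interesting question is which text was matched rather than whether there was a match.

```javascript
const re = /^(?:(?:gatos?|cachorro):)?/;

// Hypothetical helper: returns the text matched at the start of the input.
// The pattern can always match the empty string, so the null branch is
// only a safeguard.
function matchedPrefix(input) {
  const m = input.match(re);
  return m === null ? null : m[0];
}

const samples = ["", "gato:", "gatos:", "cachorro:", "gato:blabla", "xgato:", "gato"];
for (const s of samples) {
  console.log(JSON.stringify(s), "->", JSON.stringify(matchedPrefix(s)));
}
// "gato:blabla" yields "gato:", while "xgato:" and "gato" (no colon)
// yield only the empty string.
```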

I imagine the original context of this is something like this:

/^(?:(?:https?|ftp):)?/

That is, it is something related to checking whether a piece of text is a link (whether it starts with http:, https: or ftp:). However, it still accepts the case where none of these is present, on account of the last ?.
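As a rough sketch of how such a pattern might be used, assuming the guess about the original context is right, the helper below returns the scheme prefix of a piece of text, or an empty string when there is none. The function name schemePrefix and the example.com addresses are hypothetical.

```javascript
const schemeRe = /^(?:(?:https?|ftp):)?/;

// Hypothetical helper: returns "http:", "https:" or "ftp:" when the text
// starts with one of them, and "" otherwise. The pattern always matches
// (at worst the empty string), so match() never returns null here.
function schemePrefix(text) {
  return text.match(schemeRe)[0];
}

console.log(schemePrefix("https://example.com")); // scheme present
console.log(schemePrefix("ftp://example.com/f")); // scheme present
console.log(schemePrefix("//example.com/page"));  // no scheme: empty string
console.log(schemePrefix("mailto:someone"));      // other scheme: empty string
```

Because the group is optional, a caller has to compare the result with the empty string to tell whether one of the schemes was actually present.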

  • Very grateful for the explanation, I have to study further...

  • The non-capturing group is actually (?: ... ); without the : it will generate an error.

  • @Guilhermelautert I had forgotten to type the :. Edited. Thanks for the warning.

6

Your regex seems to search the string for a 'field' in one of these forms:

gato:
gatos:
cachorro:

(?:) means that your regex should match this pattern but the match is not stored in a group; that is, it is a non-capturing group.

The ? at the end means that the preceding character or group is optional, as in gatos?, which can match gato or gatos.

Related:

Meaning of ?: ?= ?! ?<= ?<! in a regex

  • Thank you all for your help, I will search the link.

  • "Empty group" would not be the correct term; as I remark in my reply, an empty group refers to nothing. I believe the most correct term would be "non-countable group", despite being ugly.

  • You could also say "non-capturing group", as mentioned in the other answer.

  • @Guilhermelautert thanks for the suggestion, I edited the reply.
