You can use the following regex:
\b\d{6}\b
The \d is the metacharacter for a digit, matching any digit from 0 to 9.
The {6} indicates that there must be 6 of them in a row.
The \b marks a word boundary, here the beginning and the end of the word, which means the match has to be exactly 6 digits in a row; a run of 7 or more digits is no longer valid.
See it working on regex101
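As a minimal sketch of how this could be used in code, assuming Python's standard `re` module (the question does not name a language, and the helper name `find_six_digit_numbers` is made up for illustration):

```python
import re

# Exactly six digits with a word boundary on each side.
SIX_DIGITS = re.compile(r"\b\d{6}\b")

def find_six_digit_numbers(text):
    """Return every run of exactly six digits that stands as its own word."""
    return SIX_DIGITS.findall(text)

sample = "codes: 123456, 1234567, 12345 and 654321abc"
# 1234567 is too long, 12345 is too short, and 654321abc has no
# word boundary between the last digit and the letter "a".
print(find_six_digit_numbers(sample))  # ['123456']
```

Note that letters and the underscore count as word characters, which is why digits glued to text are not matched by this version.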
It works perfectly for the case you mentioned. If you need something broader that also works when the numbers are immediately followed by text, you can use this regex, for example, which is already a little more complex:
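(?:^|\D)(\d{6})(?:\D|$)
Explanation (each alternation is wrapped in a non-capturing group (?:...), otherwise the | would split the whole expression into three separate alternatives):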
^|\D - Start of line or something that is not a digit
(\d{6}) - 6 digits in a row, captured in capture group 1
\D|$ - Followed by something that is not a digit or by the end of line
Here I also used the metacharacter \D, which is the inverse of \d and means anything other than a digit.
A subtle difference between this regex and the previous one is that here the digits are in capture group 1, instead of being the complete match of the regex. So depending on how you use the regex in code, the way you retrieve the number will be different.
See it on regex101
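To make the difference between the complete match and capture group 1 concrete, here is a small sketch, again assuming Python's `re` module. It writes the pattern with each alternation wrapped in a non-capturing group, `(?:^|\D)` and `(?:\D|$)`, so that each `|` applies only to its own side of the digits; the helper name `first_six_digit_number` is hypothetical.

```python
import re

# Non-digit (or start) + six captured digits + non-digit (or end).
BROADER = re.compile(r"(?:^|\D)(\d{6})(?:\D|$)")

def first_six_digit_number(text):
    """Return the first six-digit run not adjacent to other digits, or None."""
    match = BROADER.search(text)
    return match.group(1) if match else None

match = BROADER.search("id654321abc")
print(match.group(0))  # 'd654321a' (the complete match includes the neighbours)
print(match.group(1))  # '654321'   (only the captured digits)
print(first_six_digit_number("1234567"))  # None (seven digits in a row)
```

So with this regex the number has to be read from group 1, not from the complete match.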
In regex territory it is always like this. The simplest general rule is stricter and does not cover every case, but sometimes it is exactly what you need. Always try to find a balance between what you want to validate and the complexity/efficiency you are willing to accept or give up.
The more I study regex, the more I realize that what you said in the last paragraph is true: finding the balance between accuracy and complexity is fundamental, and it is often difficult to find that equilibrium point. If the system allowed it, I would give one more vote just for this comment :-) By the way, another alternative is to use a negative lookbehind and a negative lookahead:
(?<!\d)(\d{6})(?!\d)
– hkotsubo
@hkotsubo Exactly. There are many cases where the added complexity and loss of efficiency simply are not worth it just to handle one or two cases that almost never happen, and you end up with monstrous regexes like some I have seen. Interestingly, I started out with exactly this regex with negative lookahead, but I ended up switching to what I have in the answer because I thought it was simpler. In addition, some engines are more restrictive about lookbehind.
– Isac
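One practical difference between the two approaches discussed in these comments can be sketched as follows, assuming Python's `re` module, which accepts the fixed-width lookbehind used here (other engines may differ, as noted above). `CONSUMING` is a hypothetical name for the answer's broader pattern, written with its alternations grouped explicitly.

```python
import re

# Lookarounds only assert what is next to the digits; they consume nothing.
LOOKAROUND = re.compile(r"(?<!\d)(\d{6})(?!\d)")
# This variant consumes the neighbouring non-digit characters as part of the match.
CONSUMING = re.compile(r"(?:^|\D)(\d{6})(?:\D|$)")

def six_digit_runs(pattern, text):
    """Return the captured six-digit runs found by the given pattern."""
    return pattern.findall(text)

text = "123456 654321"
print(six_digit_runs(LOOKAROUND, text))  # ['123456', '654321']
# The single space is consumed by the first match, so nothing is left
# to serve as the leading non-digit of the second number.
print(six_digit_runs(CONSUMING, text))   # ['123456']
```

When numbers can be separated by a single character, the lookaround version finds all of them, while the consuming version can skip every other one.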