Short answer
^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}-\1{2}$)\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$
I’m not sure the hyphen needs to be escaped with \\ as you did. If it does, just change the regex to:
^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}\\-\1{2}$)\d{2}\.\d{3}\.\d{3}\/\d{4}\\-\d{2}$
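To check the hyphen question, here is a minimal sketch in Python (an assumption on my part, since the question uses PHP/Laravel; the constructs involved behave the same way in Python's re module). Outside a character class the hyphen has no special meaning, so - and \- should match exactly the same strings. The doubled backslash above is presumably string-literal escaping; in a Python raw string a single backslash is enough. The sample values are made up for illustration.

```python
import re

# The same pattern, once with a plain hyphen and once with an escaped hyphen.
PLAIN = re.compile(
    r"^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}-\1{2}$)\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$"
)
ESCAPED = re.compile(
    r"^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}\-\1{2}$)\d{2}\.\d{3}\.\d{3}\/\d{4}\-\d{2}$"
)

# Hypothetical sample inputs: well-formed, all digits equal, missing hyphen.
SAMPLES = ["12.345.678/0001-95", "11.111.111/1111-11", "12.345.678/000195"]


def variants_agree(samples):
    """Return True if both patterns accept/reject every sample identically."""
    return all(
        (PLAIN.match(s) is not None) == (ESCAPED.match(s) is not None)
        for s in samples
    )


print(variants_agree(SAMPLES))
```

If both variants agree on your own inputs as well, the escape is harmless but unnecessary.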
Long answer
First we have the markers ^ and $, which mean, respectively, the beginning and the end of the string. This guarantees that the string contains nothing besides what the regex describes.
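To illustrate what the anchors buy you, here is a small sketch in Python (an assumption; the question is about PHP, but ^ and $ behave the same way here). It uses only the format part of the regex, and the surrounding text in the sample is invented for the example.

```python
import re

# Only the format part of the regex, without the lookahead.
FORMAT = r"\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}"

unanchored = re.compile(FORMAT)
anchored = re.compile("^" + FORMAT + "$")

# Hypothetical input with extra text around the number.
text = "CNPJ: 12.345.678/0001-95 (headquarters)"

# Without anchors the pattern is found somewhere inside the text.
found_unanchored = unanchored.search(text) is not None

# With anchors the whole string must have the format, so this fails.
found_anchored = anchored.search(text) is not None

print(found_unanchored, found_anchored)
```

One caveat: in both PCRE and Python, $ by default also matches just before a single trailing newline. If that matters, \z in PCRE (\Z in Python) matches only at the very end of the string.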
After the ^ (string start), the regex has two main parts. Let’s see how each one works separately.
The first part, in parentheses, (?!...) is a Negative Lookahead. Basically, it checks that the string does not match the expression inside the parentheses.
The first thing inside the Lookahead is (\d). The shortcut \d matches a digit, and the parentheses form a capture group. This means that if the first character is a digit, it will be "captured" by the regex. And since it is the first pair of capturing parentheses, it will be referred to as group 1 (the Lookahead does not count because it does not form a capture group by itself).
Then I use \1, which is a backreference to group 1. This means that \1 matches the same digit that was captured in group 1. That is, (\d)\1 checks whether there are two digits in a row and whether they are the same digit.
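The capture group and backreference can be tried in isolation. This is a minimal sketch in Python (an assumption; the syntax for groups and \1 is the same as in PHP), and the helper name is made up for the example.

```python
import re

# (\d) captures one digit as group 1; \1 requires that same digit again.
SAME_PAIR = re.compile(r"(\d)\1")


def repeated_first_digit(s):
    """Return the digit if s starts with the same digit twice, else None."""
    m = SAME_PAIR.match(s)  # match() only tries at the start of the string
    return m.group(1) if m else None


print(repeated_first_digit("11"))      # same digit twice
print(repeated_first_digit("12"))      # two different digits
print(repeated_first_digit("77.123"))  # only the first two characters matter
```

Note that \d\d would accept any two digits; it is the backreference that forces the second one to equal the first.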
Next we have \., which matches the dot character itself (.), and then \1{3}, which means "exactly 3 occurrences ({3}) of what was captured in group 1 (i.e., the digit we captured in (\d))".
The rest of the expression (\.\1{3}\/\1{4}-\1{2}$) checks for another dot, 3 more occurrences of the same digit, a slash, 4 occurrences of the same digit, a hyphen, 2 occurrences of the same digit, and finally the end of the string ($).
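To see what the expression inside the Lookahead matches on its own, it can be pulled out and run as a standalone pattern. A minimal sketch in Python (an assumption; the pattern is copied from the Lookahead, with ^ added in front because here it is not preceded by the outer ^):

```python
import re

# The body of the lookahead, used as an ordinary pattern.
ALL_SAME_DIGIT = re.compile(r"^(\d)\1\.\1{3}\.\1{3}\/\1{4}-\1{2}$")


def is_single_digit_repeated(s):
    """True only for strings like 11.111.111/1111-11 (one digit throughout)."""
    return ALL_SAME_DIGIT.match(s) is not None


for sample in ["11.111.111/1111-11", "00.000.000/0000-00",
               "11.111.111/1111-12", "12.345.678/0001-95"]:
    print(sample, is_single_digit_repeated(sample))
```

A single different digit anywhere is enough to make this pattern fail, which is exactly the condition the Lookahead then negates.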
That is, the whole expression checks whether the same digit repeats throughout (it matches cases like 11.111.111/1111-11 and 22.222.222/2222-22). And the Negative Lookahead ((?!...)) ensures that the string does not have that format. Therefore, if all digits are equal, the Lookahead fails and the regex does not find a match.
The trick of the Lookahead is that it first checks the string and, if the check passes, goes back to where it was and continues evaluating the rest of the expression. Since the Lookahead comes right after the ^ (string start), it goes back to the beginning of the string and continues evaluating the rest of the regex. If the Lookahead fails, the regex also fails and finds no match.
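The "goes back to where it was" behavior means the Lookahead consumes no characters. Here is a minimal sketch in Python (an assumption; lookaheads work the same way in PHP) using a deliberately simplified, hypothetical pattern, (?!00)\d{2}, instead of the full regex:

```python
import re

# Hypothetical simplified pattern: two digits, but not "00".
NOT_DOUBLE_ZERO = re.compile(r"(?!00)\d{2}")

# The lookahead inspects "12", then \d{2} matches those same two characters.
m = NOT_DOUBLE_ZERO.match("12")
matched_text = m.group()  # the text consumed by \d{2}

# The lookahead on its own matches with zero width: nothing is consumed.
lookahead_only = re.match(r"(?!00)", "12")
lookahead_width = lookahead_only.end() - lookahead_only.start()

# If the lookahead fails, the whole match fails.
rejected = NOT_DOUBLE_ZERO.match("00") is None

print(matched_text, lookahead_width, rejected)
```

If the Lookahead consumed the characters it inspected, \d{2} would have nothing left to match in "12".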
The second part is the regex you were already using (2 digits, dot, 3 digits, dot, 3 digits, slash, 4 digits, hyphen, 2 digits and end of string).
The combination of Lookahead with your expression ensures that you have what you need:
- the Negative Lookahead ensures that the digits are not all the same
- if the Lookahead check passes (i.e., the string is not one of the cases where all digits are equal), it goes back to where it was (in this case, the beginning of the string) and checks the rest of the expression
- the rest checks if it is in the format you specified
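The three points above can be exercised together. This is a minimal sketch in Python (an assumption; the regex is the one from the short answer, split over two lines only for the comments), with a hypothetical helper name and made-up sample values:

```python
import re

CNPJ_FORMAT = re.compile(
    r"^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}-\1{2}$)"  # reject one digit repeated
    r"\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$"        # required layout
)


def has_valid_cnpj_format(value):
    """True if value has the layout and is not one digit repeated throughout."""
    return CNPJ_FORMAT.match(value) is not None


# Hypothetical inputs with the result each one should produce.
CASES = {
    "12.345.678/0001-95": True,   # right layout, mixed digits
    "11.111.111/1111-11": False,  # right layout, all digits equal
    "11.111.111/1111-12": True,   # one different digit is enough
    "12345678000195": False,      # no punctuation
    "12.345.678/0001-9": False,   # too short
}

for value, expected in CASES.items():
    print(value, has_valid_cnpj_format(value), expected)
```

This only checks the format; the regex does not verify the check digits of the number.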
I recommend installing the following package in your Laravel project: https://github.com/geekcom/validator-docs - The main advantage of this specific project is that you can validate a CPF or a CNPJ in the same field. For example:
$this->validate($request, [ 'cpf_or_cnpj' => 'formato_cpf_cnpj|cpf_cnpj', ]);
* Excellent documentation. Source: How to implement a validation rule in Laravel – Otto
Yeah, I saw it; people here recommended it and I was already looking at it. It’s great for this, and it’s also pretty easy to use within the application.
– Romulo Sousa