Regex to reject repeated digits in a CNPJ

I have the following regular expression:

regex:/^\d{2}\.\d{3}\.\d{3}\/\d{4}\\-\d{2}$/

It validates the format, but it does not reject repeated digits.

I want this regex to also reject values in which every digit is the same, for example 11.111.111/1111-11, 22.222.222/2222-22, and so on.

I’m using this Regex inside a Laravel Request.

public function rules()
    {
        return [
            'name'                  =>  'required|unique:companies',
            'email'                 =>  'required|email|unique:companies',
            'cnpj'                  =>  'required|unique:companies|regex:/^\d{2}\.\d{3}\.\d{3}\/\d{4}\\-\d{2}$/',
            'display_name'          =>  'required',
            'description'           =>  'string',
            'address'               =>  'required|string',
            'address_number'        =>  'required|numeric',
            'district'              =>  'required',
            'zip_code'              =>  'required|min:9',
            'city_id'               =>  'required',
            'site_url'              =>  'required',
            'photo_url'             =>  'required|image',
            'phone_number'          =>  'required|min:10',
        ];
    }

How would you do that?

  • I recommend installing the following package in your Laravel project: https://github.com/geekcom/validator-docs - Its main advantage is that you can validate a CPF or a CNPJ in the same field. For example: $this->validate($request, [ 'cpf_or_cnpj' => 'formato_cpf_cnpj|cpf_cnpj', ]); It also has excellent documentation. Source: How to implement a validation rule in Laravel

  • Yes, I saw it; people here recommended it and I was just looking at it. It's great for this, and it's quite easy to use within the application.

1 answer


Short answer

^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}-\1{2}$)\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$

I don't think the hyphen needs to be escaped with \\ as you did, since outside a character class it is an ordinary character. If you do need the escape, just change the regex to:

^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}\\-\1{2}$)\d{2}\.\d{3}\.\d{3}\/\d{4}\\-\d{2}$
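
To try the short answer outside Laravel, here is a minimal sketch using preg_match(). The function name hasValidCnpjFormat and the sample values are made up for illustration, and the check covers only the format and the repeated digits, not the CNPJ check digits.

```php
<?php
// Hypothetical helper: wraps the pattern from the short answer in / delimiters.
function hasValidCnpjFormat(string $cnpj): bool
{
    $pattern = '/^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}-\1{2}$)\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$/';

    return preg_match($pattern, $cnpj) === 1;
}

// Illustrative inputs only.
$samples = [
    '11.111.111/1111-11', // every digit equal    -> rejected
    '22.222.222/2222-22', // every digit equal    -> rejected
    '12.345.678/0001-95', // mixed digits         -> accepted
    '12345678000195',     // punctuation missing  -> rejected
];

foreach ($samples as $cnpj) {
    echo $cnpj, ' => ', hasValidCnpjFormat($cnpj) ? 'accepted' : 'rejected', PHP_EOL;
}
```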

Long answer

First we have the markers ^ and $, which mean, respectively, the beginning and the end of the string. They guarantee that the string contains only what the regex describes and nothing else.
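
A small sketch of what the anchors change, using only the format part of the expression. The input text is invented for the example.

```php
<?php
// Same format check, without and with the ^ and $ anchors.
$unanchored = '/\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}/';
$anchored   = '/^\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$/';

// Illustrative input: a CNPJ-shaped value surrounded by other text.
$input = 'CNPJ: 12.345.678/0001-95 (head office)';

$foundInside = preg_match($unanchored, $input) === 1; // finds the value inside the text
$wholeString = preg_match($anchored, $input) === 1;   // the whole string must be the value

var_dump($foundInside); // bool(true)
var_dump($wholeString); // bool(false)
```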

After the ^ (string start), the regex has two main parts. Let's see how each one works.


The first part, in parentheses, (?!...), is a negative lookahead. Basically, it checks that the string does not match the expression inside the parentheses.

The first thing inside the lookahead is (\d). The shortcut \d matches a digit, and the parentheses form a capture group. This means that if the first character is a digit, it is "captured" by the regex. Since it is the first pair of capturing parentheses, it is referred to as group 1 (the lookahead does not count because by itself it does not form a capture group).

Then I use \1, which is a way of referencing group 1. This means that \1 will have the same value as the digit that was captured in group 1. That is, (\d)\1 checks if there are two digits in a row and if they are the same digit.

Next we have \., which matches the dot character itself (.), and then \1{3}, which means "exactly 3 occurrences ({3}) of what was captured in group 1 (i.e., the digit we captured in (\d))".

The rest of the expression (\.\1{3}\/\1{4}-\1{2}$) checks for another dot, 3 more occurrences of the same digit, a slash, 4 occurrences of the same digit, a hyphen, 2 occurrences of the same digit, and finally the end of the string ($).
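
To see the backreference at work, the expression inside the lookahead can be run on its own. This is only a sketch: here the expression is anchored with ^ and used as a normal pattern, so it matches exactly the strings that the final regex must reject. The variable names and sample values are illustrative.

```php
<?php
// The body of the lookahead, used as a standalone pattern.
$allSameDigit = '/^(\d)\1\.\1{3}\.\1{3}\/\1{4}-\1{2}$/';

// Every digit is 3: the pattern matches and group 1 holds the repeated digit.
$matchedSame = preg_match($allSameDigit, '33.333.333/3333-33', $groups);
echo $matchedSame, ' ', $groups[1], PHP_EOL; // 1 3

// The last digit differs: \1{2} cannot match "34", so there is no match.
$matchedMixed = preg_match($allSameDigit, '33.333.333/3333-34');
echo $matchedMixed, PHP_EOL; // 0
```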

That is, the whole expression checks whether the same digit repeats throughout (it matches cases like 11.111.111/1111-11 and 22.222.222/2222-22). The negative lookahead ((?!...)) ensures that the string does not have that format. Therefore, if all digits are equal, the lookahead fails and the regex does not find a match.

The trick of a lookahead is that it first checks the string and, if the check passes, goes back to where it was and continues evaluating the rest of the expression. Since the lookahead comes right after the ^ (string start), it goes back to the beginning of the string and the rest of the regex is evaluated from there. If the lookahead fails, the regex also fails and finds no match.
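
The "goes back to where it was" behaviour can be seen in a much smaller pattern. This is a sketch with an invented rule (two digits, but not 00), not part of the CNPJ regex.

```php
<?php
// Two digits, as long as they are not "00".
$notDoubleZero = '/^(?!00)\d{2}$/';

// The lookahead inspects "07", finds it is not "00", and consumes nothing:
// \d{2} then matches the same two characters from the start of the string.
$okCount = preg_match($notDoubleZero, '07', $match);
echo $okCount, ' ', $match[0], PHP_EOL; // 1 07

// Here the lookahead finds "00", so the whole match fails.
$rejectedCount = preg_match($notDoubleZero, '00');
echo $rejectedCount, PHP_EOL; // 0
```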


The second part is the regex you were already using (2 digits, dot, 3 digits, dot, 3 digits, slash, 4 digits, hyphen, 2 digits and end of string).

The combination of Lookahead with your expression ensures that you have what you need:

  • the Negative Lookahead ensures that the digits are not all the same
  • if the lookahead check passes (i.e., the string is not one of the cases where all digits are equal), the regex goes back to where it was (in this case, the beginning of the string) and checks the rest of the expression
  • the rest checks if it is in the format you specified
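
Since the question uses this inside a Laravel Request, here is a sketch of how the 'cnpj' entry might look. Laravel's regex rule hands the pattern to preg_match(), so the pattern needs its / delimiters. The array form of the rule list is a choice made here, not something from the question, and the variable names are illustrative. The last lines only check the pattern outside Laravel.

```php
<?php
// Sketch of the 'cnpj' entry only; the other fields stay as in the question.
// Array form keeps each rule separate from the pattern text.
$cnpjRules = [
    'required',
    'unique:companies',
    'regex:/^(?!(\d)\1\.\1{3}\.\1{3}\/\1{4}-\1{2}$)\d{2}\.\d{3}\.\d{3}\/\d{4}-\d{2}$/',
];

// Quick check without Laravel: drop the "regex:" prefix and run the pattern directly.
$pattern = substr($cnpjRules[2], strlen('regex:'));

var_dump(preg_match($pattern, '11.111.111/1111-11')); // int(0)
var_dump(preg_match($pattern, '12.345.678/0001-95')); // int(1)
```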

  • I had only just read a little about that \1.

  • @Romulosousa Yes, it is very useful for cases like this, where you need to check if the same value repeats

  • I only know a little regex, but it is useful for so many things, and your explanation is very good.

  • When I try to register, I get this error: No ending delimiter '^' found

  • I confess that I don't know much about Laravel, but a quick search turned up this and this link. You forgot to put the / before and after the expression (it is a delimiter that some languages use to say that everything between them is a regex, but it is not part of the regex itself).

  • It worked; it was the quotes " " that were missing.

  • Thanks man! Thank you so much!
