Regular expression to validate a field that accepts CPF or CNPJ

49

I have an <input> on a form that must accept either a CPF or a CNPJ.

On the back end, in PHP, I will save the value to either the "cpf" field or the "cnpj" field of a database table.

I'm using AngularJS and I need a regular expression to put in ng-pattern so that only usable data is accepted.

For now, I am not concerned with validating the check digits at the end.

I want to be a little flexible and let the user type the dots, dash and slash if they want, or just the digits.

4 answers

97


Solution:

This solution validates these formats: 00000000000, 00000000000000, 000.000.000-00, 00.000.000/0000-00, and even 000000000-00 or 00000000/0000-00, for example. The dots, slash and dash are each optional in their positions.

What is not accepted, for example: 000-000-000-00 (but this can be changed, as noted below).

Regex:

([0-9]{2}[\.]?[0-9]{3}[\.]?[0-9]{3}[\/]?[0-9]{4}[-]?[0-9]{2})|([0-9]{3}[\.]?[0-9]{3}[\.]?[0-9]{3}[-]?[0-9]{2})

You can run your own tests on RegexPal.

Explanation:

  • [0-9]{2} Character range: 0 to 9, quantity: 2 characters;
  • [0-9]{3} Character range: 0 to 9, quantity: 3 characters;
  • [0-9]{4} Character range: 0 to 9, quantity: 4 characters;
  • [\.]? A dot, optional. It is escaped because a dot on its own is a special character;
  • [-]? A dash, optional (if you add other characters to the class, always put the - first);
  • [\/]? A slash, optional. It is also escaped with \ to satisfy PCRE;
  • (group1)|(group2) If either group matches, the expression is valid.
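The pieces listed above can be tried out with a minimal JavaScript sketch. It assumes the expression is wrapped in ^ and $ so the whole string must match; the constant name cpfOrCnpj and the sample values are illustrative only.

```javascript
// Same expression as in the answer, with ^...$ added so the whole string must match.
// The slash is escaped because / delimits a JavaScript regex literal.
const cpfOrCnpj =
  /^(([0-9]{2}[\.]?[0-9]{3}[\.]?[0-9]{3}[\/]?[0-9]{4}[-]?[0-9]{2})|([0-9]{3}[\.]?[0-9]{3}[\.]?[0-9]{3}[-]?[0-9]{2}))$/;

const samples = [
  "00000000000",        // CPF, digits only
  "000.000.000-00",     // CPF, fully formatted
  "000000000-00",       // CPF, dash only
  "00000000000000",     // CNPJ, digits only
  "00.000.000/0000-00", // CNPJ, fully formatted
  "00000000/0000-00",   // CNPJ, slash and dash only
  "000-000-000-00",     // rejected: dashes where dots are expected
];

// Pair each sample with the result of the test and print it.
const results = samples.map((s) => [s, cpfOrCnpj.test(s)]);
for (const [value, ok] of results) console.log(value, ok);
```

Only the last sample should print false, since a dash is not allowed in the positions reserved for dots.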

If you want to accept other separators, just add them between the [ ].
Example: [-\.\/]? accepts -, . or / in that position (the ? means "or nothing").

To adapt it to other regexp "dialects", some possible variations are removing the escape from the slash (\/ => /) and optionally placing a ^ at the beginning and a $ at the end of the expression.
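Both adjustments can be sketched in JavaScript. The names sep, cpfBody, cnpjBody, loose and strict are hypothetical and not part of the answer; the sketch assumes every separator position should accept any of the three characters.

```javascript
// Hypothetical variation: every separator position accepts -, . or / (or nothing).
// Inside [ ], the dot needs no escape and the dash is placed first.
const sep = "[-./]?";
const cpfBody = `[0-9]{3}${sep}[0-9]{3}${sep}[0-9]{3}${sep}[0-9]{2}`;
const cnpjBody = `[0-9]{2}${sep}[0-9]{3}${sep}[0-9]{3}${sep}[0-9]{4}${sep}[0-9]{2}`;

// With the RegExp constructor the pattern is a string, so / needs no escaping.
const loose = new RegExp(`(${cnpjBody})|(${cpfBody})`);        // no anchors
const strict = new RegExp(`^(?:(${cnpjBody})|(${cpfBody}))$`); // whole string

console.log(strict.test("000-000-000-00")); // accepted with the wider separator class
console.log(loose.test("123456789012"));    // 12 digits: the first 11 still match
console.log(strict.test("123456789012"));   // 12 digits: rejected by the anchors
```

The unanchored version reports a match for 12 digits because it only needs to find a valid run somewhere in the string, which is why the anchors matter for field validation.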

  • 4

    I asked this question only because, when I searched on Google, the results were dreadful. I intended to post my own answer right away, but you were very quick to reply, and you provided a good and correct answer (not just something copied and pasted from a Google search). Today I finally posted my answer, which only reports what was already implemented when I asked the question. I am accepting your answer as the correct one on merit, but let me note publicly that my answer, below, is equally valid. All right? Thank you.

  • 2

    And long live Stack Overflow... that reaches the top of the results in Google and people can access good answers! :-)

  • 2

    @J.Bruni when that is the case, leave a note in a comment on the question so we know an answer is already on the way (then I will delete this comment). Thanks for the consideration. In fact, I wrote the regex on the spot; I did not even try searching, since it is such a specific thing.

  • 2

    simpler: CNPJ_REGEXP=/^(\d{2}\.?\d{3}\.?\d{3}\/?\d{4}-?\d{2})$/

  • 1

    @Fernandofabreti I tried to make it more didactic and readable, but it is good to have the version with \d (the parser usually turns both versions into the same thing). If I edit the answer later, I will add your suggestion, duly credited.

  • 1

    Ahh, got it. Cool.

  • @Bacco I also suggest adding the ^ and $ to avoid a match when the user types, for example, 12, 13 or 15 digits. I also added an answer using lookbehind and lookahead, for cases where the CPF or CNPJ may appear in the middle of a text.

  • @Paulomerson I had already mentioned that at the end of the answer. If need be, I will elaborate to highlight the point. Thanks for the tip anyway.

  • 1

    Not to mention that you have the option of answering your own question as soon as you ask it.

20

Regular expression to validate a field that accepts CPF or CNPJ (no calculation of the check digits):

/^([0-9]{3}\.?[0-9]{3}\.?[0-9]{3}\-?[0-9]{2}|[0-9]{2}\.?[0-9]{3}\.?[0-9]{3}\/?[0-9]{4}\-?[0-9]{2})$/

It can be read as follows (where "cpf" is the expression that validates a CPF and "cnpj" is the expression that validates a CNPJ):

/^(cpf|cnpj)$/

The leading and trailing slashes (/) are not part of the expression itself; they are only delimiters. The ^ at the beginning and the $ at the end require the entire string being validated to match the expression between them. Parentheses containing a vertical bar, (a|b), create an alternation between "a" and "b": if either expression is satisfied, the result is positive. In place of "a" and "b" we have the specific expressions for CPF and CNPJ.

For CPF:

[0-9]{3}\.?[0-9]{3}\.?[0-9]{3}\-?[0-9]{2}

The question mark (?) makes the preceding character optional, so the dots and the dash are optional. The character class [0-9] matches any character from 0 to 9 (we could use \d, but I find [0-9] more readable). Finally, the number in braces ({3}) sets exactly how many times the preceding item must be repeated. In total, 11 digits are required (3 + 3 + 3 + 2).

For CNPJ, the structure is similar:

[0-9]{2}\.?[0-9]{3}\.?[0-9]{3}\/?[0-9]{4}\-?[0-9]{2}

Here a total of 14 numeric characters are required (2 + 3 + 3 + 4 + 2).

Remember that the backslash (\) before the dot (.) and other special characters is an "escape" character: it suppresses the special meaning of the following character so that it is treated literally. (An unescaped dot means "any character"; an escaped dot means just the dot character itself.)
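As a rough illustration in JavaScript, the /^(cpf|cnpj)$/ structure can be assembled from the two sub-expressions, and the effect of escaping the dot can be checked directly. The variable names and sample values below are assumptions made for this sketch.

```javascript
// Sub-expressions from the answer, kept as strings so they can be combined.
// String.raw keeps the backslashes exactly as written.
const cpf = String.raw`[0-9]{3}\.?[0-9]{3}\.?[0-9]{3}\-?[0-9]{2}`;
const cnpj = String.raw`[0-9]{2}\.?[0-9]{3}\.?[0-9]{3}\/?[0-9]{4}\-?[0-9]{2}`;

// Equivalent of /^(cpf|cnpj)$/ with the placeholders filled in.
const cpfOrCnpjPattern = new RegExp(`^(${cpf}|${cnpj})$`);

// Without the escape, the dot means "any character".
const unescapedDot = /^[0-9]{3}.[0-9]{3}$/;
const escapedDot = /^[0-9]{3}\.[0-9]{3}$/;

console.log(cpfOrCnpjPattern.test("123.456.789-09"));     // true
console.log(cpfOrCnpjPattern.test("12.345.678/0001-95")); // true
console.log(unescapedDot.test("123x456"));                // true
console.log(escapedDot.test("123x456"));                  // false
```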


Finding out whether it is a CPF or a CNPJ

On the server side, in PHP, the choice between CPF and CNPJ is made by counting the digits present in the field:

// Keep only the digits to decide which field receives the value
$numeros = preg_replace('/[^0-9]/', '', $valor);

if (strlen($numeros) == 11)
{
    $cliente->cpf = $valor;
}
elseif (strlen($numeros) == 14)
{
    $cliente->cnpj = $valor;
}

Note: this does not replace the validation done by the regular expression shown above, which is also performed on the server side (in my case the rules are embedded in the model, using the same CPF and CNPJ expressions shown above, only separated, each on its respective field).
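The same digit-count selection could be mirrored on the client side. The following JavaScript is only a sketch: the function name classifyDocument and its return shape are hypothetical and not part of the answer.

```javascript
// Hypothetical helper mirroring the PHP logic: strip non-digits, then decide by length.
function classifyDocument(value) {
  const digits = value.replace(/[^0-9]/g, "");
  if (digits.length === 11) return { type: "cpf", digits };
  if (digits.length === 14) return { type: "cnpj", digits };
  return { type: null, digits }; // neither a CPF nor a CNPJ
}

console.log(classifyDocument("123.456.789-09").type);
console.log(classifyDocument("12.345.678/0001-95").type);
console.log(classifyDocument("1234").type);
```

As with the PHP version, this only picks the field; it assumes the format has already been validated by the regular expression.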

1

For a simple input (where only a CPF or CNPJ may be typed):

^(\d{2}\.?\d{3}\.?\d{3}\/?\d{4}-?\d{2}|\d{3}\.?\d{3}\.?\d{3}-?\d{2})$

For an input or text box where the CPF or CNPJ may appear at the beginning, middle or end of a text:

(?<=\D|^)(\d{2}\.?\d{3}\.?\d{3}\/?\d{4}-?\d{2}|\d{3}\.?\d{3}\.?\d{3}-?\d{2})(?=\D|$)

The main difference is the use of lookbehind and lookahead to ensure that the digits of the CPF or CNPJ are preceded and followed by non-numeric characters (or by the start or end of the text). Without them, the expression would match inside a sequence of, for example, 20 digits!
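A short JavaScript sketch of this behaviour follows, assuming an engine with lookbehind support (ES2018 or later). The sample text and variable names are made up for illustration.

```javascript
// Pattern from the answer, with the g flag to find every occurrence in a text.
// Lookbehind (?<=...) requires an ES2018-capable engine.
const inText =
  /(?<=\D|^)(\d{2}\.?\d{3}\.?\d{3}\/?\d{4}-?\d{2}|\d{3}\.?\d{3}\.?\d{3}-?\d{2})(?=\D|$)/g;

const text = "CPF 123.456.789-09, CNPJ 12.345.678/0001-95, ref 12345678901234567890.";

// Collect the full text of every match.
const found = [...text.matchAll(inText)].map((m) => m[0]);
console.log(found);

// For comparison: the CPF part without lookarounds matches inside the 20-digit run.
const withoutLookaround = /\d{3}\.?\d{3}\.?\d{3}-?\d{2}/;
console.log(withoutLookaround.test("12345678901234567890")); // true
```

Here found should contain only the CPF and the CNPJ; the 20-digit reference is skipped because every candidate inside it has a digit immediately before or after it.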

-5

E-mail: "^([-a-zA-Z0-9_-]*@(gmail|yahoo|ymail|rocketmail|bol|hotmail|live|msn|ig|globomail|oi|pop|inteligweb|r7|folha|zipmail).(com|info|gov|net|org|tv)(.[-a-z]{2})?)*$"

CPF: "^((\d{3}).(\d{3}).(\d{3})-(\d{2}))*$"

CNPJ: "^((\d{2}).(\d{3}).(\d{3})/(\d{4})-(\d{2}))*$"

CPF/CNPJ: "^(((\d{3}).(\d{3}).(\d{3})-(\d{2}))?((\d{2}).(\d{3}).(\d{3})/(\d{4})-(\d{2}))?)*$"

Ex.: (VB.NET)

Dim hRegex As Regex = New Regex("^(((\d{3}).(\d{3}).(\d{3})-(\d{2}))?((\d{2}).(\d{3}).(\d{3})/(\d{4})-(\d{2}))?)*$", RegexOptions.None)

MsgBox(hRegex.IsMatch($"000.000.000-00")) 'Output: {True|False}

  • Fabius, edit your answer; to format the code, press CTRL+K.
