How to validate and calculate the control digit of a CPF

Viewed 11,991 times

2

How does the algorithm that calculates the check digits of a CPF (Cadastro de Pessoas Físicas, the Brazilian individual taxpayer registry) work? And how is this calculation used to validate a CPF? If possible, I would like examples in Python.

  • You can validate a CPF/CNPJ using the code in this GitHub repository: https://github.com/rafahlobo/cpfValidator

5 answers

8

A more Pythonic solution would be:

import re


def validate(cpf: str) -> bool:
    """Validate a CPF, checking both its formatting and its check digits.

    Parameters:
        cpf (str): CPF to be validated

    Returns:
        bool:
            - False, when the CPF is not in the format 999.999.999-99;
            - False, when the CPF does not have 11 numeric characters;
            - False, when the check digits are invalid;
            - True, otherwise.

    Examples:

    >>> validate('529.982.247-25')
    True
    >>> validate('52998224725')
    False
    >>> validate('111.111.111-11')
    False
    """

    # Check the CPF formatting
    if not re.match(r'\d{3}\.\d{3}\.\d{3}-\d{2}', cpf):
        return False

    # Keep only the digits of the CPF, ignoring punctuation
    numbers = [int(digit) for digit in cpf if digit.isdigit()]

    # Reject the CPF if it does not have 11 digits or if all digits are equal:
    if len(numbers) != 11 or len(set(numbers)) == 1:
        return False

    # Validate the first check digit:
    sum_of_products = sum(a*b for a, b in zip(numbers[0:9], range(10, 1, -1)))
    expected_digit = (sum_of_products * 10 % 11) % 10
    if numbers[9] != expected_digit:
        return False

    # Validate the second check digit:
    sum_of_products = sum(a*b for a, b in zip(numbers[0:10], range(11, 1, -1)))
    expected_digit = (sum_of_products * 10 % 11) % 10
    if numbers[10] != expected_digit:
        return False

    return True
    

The logic is exactly the same as described in the other answers, but it uses language features (zip, range, and a generator expression inside sum) to simplify the solution.
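The expression `(sum_of_products * 10 % 11) % 10` is a compact form of the rule used in the other answers (the digit is 0 when the remainder is 0 or 1, otherwise 11 minus the remainder). Here is a minimal sketch to check that the two forms agree; the function names `digit_compact` and `digit_classic` are illustrative and do not appear in any of the answers.

```python
def digit_compact(sum_of_products: int) -> int:
    # Compact form used in the answer above
    return (sum_of_products * 10 % 11) % 10


def digit_classic(sum_of_products: int) -> int:
    # Form used in the other answers: 0 when the remainder is 0 or 1
    remainder = sum_of_products % 11
    return 0 if remainder < 2 else 11 - remainder


# Compare both forms over a range that covers every possible remainder
assert all(digit_compact(s) == digit_classic(s) for s in range(1000))
```

The two agree because multiplying by 10 modulo 11 is the same as negating, so `sum_of_products * 10 % 11` already equals `11 - remainder` (or 0), and the final `% 10` folds the value 10 down to 0.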

  • Commenting on an old answer, but the observation is still relevant: you do not need to test len(numbers) != 11, because that will never happen; the regex in the if above only allows a string with 11 digits in that specific format. It is redundant. Other than that, nice :D

  • @fernandosavio I believe that at the time I figured that, without ^ and $ in the regex, it could match a CPF in the middle of the string. My bad xD

  • The point about ^ and $ is a good one; leaving them out really can cause problems. Better to leave the code as it is. I will leave my comment here in case it enlightens someone.

  • 1

    match always searches at the beginning of the string, so if the CPF is in the middle of the string it will only be found if you use search. @fernandosavio Either way, the regex without $ can pick up extra digits at the end. So either use ^ and $, or keep the len == 11 check (but I prefer to use $, so that the size is only tested once).

  • 1

    Your solution turned out nicer; len(set(numbers)) is a good touch! Thanks @Woss
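The regex behavior discussed in the comments above can be checked with a short sketch. It uses only the standard `re` module; the constant name `PATTERN` is just for illustration.

```python
import re

PATTERN = r'\d{3}\.\d{3}\.\d{3}-\d{2}'

# re.match anchors only at the start, so trailing characters are accepted
assert re.match(PATTERN, '529.982.247-250') is not None

# re.match does not look in the middle of the string; re.search does
assert re.match(PATTERN, 'CPF: 529.982.247-25') is None
assert re.search(PATTERN, 'CPF: 529.982.247-25') is not None

# re.fullmatch requires the entire string to match the pattern
assert re.fullmatch(PATTERN, '529.982.247-25') is not None
assert re.fullmatch(PATTERN, '529.982.247-250') is None
```

With `re.match` and no end anchor, the `len(numbers) != 11` test in the answer is what rejects an input with an extra trailing digit.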

6

How does the algorithm that calculates the check digits of a CPF (Cadastro de Pessoas Físicas) work?

As Wilson Neto's answer points out, there is an explanation at this link. Basically, modulo 11 is applied to a 9-digit number to generate the first check digit. The second check digit is generated from the original 9 digits plus the first check digit.
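As a rough illustration of the modulo 11 step described above, the sketch below computes both check digits for the base 529982247 (the CPF used in the doctest of another answer on this page). The helper name `check_digit` is hypothetical.

```python
def check_digit(digits):
    """Return the modulo-11 check digit for a list of 9 or 10 digits."""
    # Weights count down to 2: 10..2 for nine digits, 11..2 for ten
    weights = range(len(digits) + 1, 1, -1)
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    # Remainders 0 and 1 map to 0; otherwise the digit is 11 - remainder
    return 0 if remainder < 2 else 11 - remainder


base = [5, 2, 9, 9, 8, 2, 2, 4, 7]
first = check_digit(base)             # weighted sum 295, 295 % 11 == 9, so 2
second = check_digit(base + [first])  # weighted sum 347, 347 % 11 == 6, so 5
print(first, second)  # 2 5
```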

And how is this calculation used to validate the CPF?

Validating a CPF uses the same calculation as generating one. From the first 9 digits, the two check digits are generated. If they are equal to the last two digits of the input provided, the CPF is valid.
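A minimal sketch of validation by regeneration, assuming the input is a string of exactly 11 digits with no punctuation; the function name `is_valid_cpf` is hypothetical.

```python
import re


def is_valid_cpf(cpf: str) -> bool:
    """Validate a digits-only CPF by regenerating its two check digits."""
    if not re.fullmatch(r'[0-9]{11}', cpf):
        return False
    digits = [int(c) for c in cpf]
    generated = digits[:9]
    # Append the first check digit, then the second
    while len(generated) < 11:
        weights = range(len(generated) + 1, 1, -1)
        remainder = sum(d * w for d, w in zip(generated, weights)) % 11
        generated.append(0 if remainder < 2 else 11 - remainder)
    # Valid only if the regenerated digits match the input
    return generated == digits


print(is_valid_cpf('52998224725'))  # True
print(is_valid_cpf('52998224724'))  # False
```

This check alone accepts repeated-digit strings such as 11111111111, whose check digits happen to be consistent, which is why some of the other answers reject them explicitly.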

If possible, I would like examples in Python.

Below is code for CPF validation in Python.

The original link is here.

#!/usr/bin/env python3
# Djames Suhanko
import sys

try:
    cpflimpo = sys.argv[1]
except IndexError:
    print("Usage: %s CPF_NUMBER" % sys.argv[0])
    sys.exit()

if len(cpflimpo) != 11 or not cpflimpo.isdigit():
    print("Wrong format. Try again (digits only)")
    sys.exit()

digito = [0, 0]
for c in range(2):
    # Weights run from 10 down to 2 for the first digit, 11 down to 2 for the second
    a = 10 + c
    total = 0
    for i in range(9 + c):
        total += int(cpflimpo[i]) * a
        a -= 1
    resto = total % 11
    # A remainder of 0 or 1 yields check digit 0
    digito[c] = 0 if resto < 2 else 11 - resto

if int(cpflimpo[9]) == digito[0] and int(cpflimpo[10]) == digito[1]:
    # Print the CPF in the 999.999.999-99 layout
    print("Valid CPF: %s.%s.%s-%s" % (cpflimpo[:3], cpflimpo[3:6], cpflimpo[6:9], cpflimpo[9:]))
else:
    print("Invalid CPF")

2

Here is an example that can be used both on the command line and as a Python library, with doctests.

import re


def validar_cpf(cpf):
    """
    Return the sanitized valid CPF, or False.

    # Valid CPFs
    >>> validar_cpf('123.456.789-09')
    '12345678909'
    >>> validar_cpf('98765432100')
    '98765432100'
    >>> validar_cpf(' 123 123 123 87 ')
    '12312312387'

    # Invalid CPFs
    >>> validar_cpf('12345678900')
    False
    >>> validar_cpf('1234567890')
    False
    >>> validar_cpf('')
    False
    >>> validar_cpf(None)
    False
    """
    cpf = ''.join(re.findall(r'\d', str(cpf)))

    if not cpf or len(cpf) < 11:
        return False

    antigo = [int(d) for d in cpf]

    # Generate a CPF with freshly computed check digits and compare it with the given CPF
    novo = antigo[:9]
    while len(novo) < 11:
        resto = sum([v * (len(novo) + 1 - i) for i, v in enumerate(novo)]) % 11

        digito_verificador = 0 if resto <= 1 else 11 - resto

        novo.append(digito_verificador)

    if novo == antigo:
        return cpf

    return False


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 2:
        print("Usage: {} [CPF]".format(sys.argv[0]))
    else:
        cpf = validar_cpf(sys.argv[1])
        print(cpf if cpf else "Invalid CPF")

0

Below is an example of code taken from the library Bradocs4py:

import re

from itertools import chain

class ValidadorCpf(object):

    def __validarCpf(self, arg):  # type: (CPF) -> bool
        return self.__validarStr(arg.rawValue)

    def __validarStr(self, arg):  # type: (str) -> bool

        if arg is None:
            return False

        p = re.compile('[^0-9]')
        x = p.sub('', arg)

        if len(x) != 11 or len(set(x)) == 1: return False

        return all(self.__hashdigit(x, i + 10) == int(v) for i, v in enumerate(x[9:]))


    def __hashdigit(self, cpf, position):  # type: (str, int) -> int
        """
        Will compute the given `position` checksum digit for the `cpf` input. The input needs to contain all
        elements previous to `position` else computation will yield the wrong result.
        """

        val = sum(int(digit) * weight for digit, weight in zip(cpf, range(position, 1, -1))) % 11

        return 0 if val < 2 else 11 - val

    @staticmethod
    def validar(arg):  # type: (CPF) -> bool or  type: (str) -> bool
        v = ValidadorCpf()

        if type(arg) == CPF: return v.__validarCpf(arg)

        if type(arg) == str: return v.__validarStr(arg)

        return False

-4

A library for CPF and CNPJ validation that verifies the check digits.

validdocbr

pip install validdocbr
pip3 install validdocbr

Usage

from validdocbr import validdocbr

validador = validdocbr.validdocbr()

validador.cpf('12345678912')
validador.cnpj('12345678904321')

Each call returns True if the document is valid or False if it is invalid.

https://pypi.org/project/validdocbr/

  • 1

    The author of the question does not simply want to validate a CPF; they want to know how the validation algorithm works, so they can validate a CPF without relying on a third-party package or module.
