Code revision of the CPF class in Java

Question

Code revision of the CPF class in Java

Asked 5 years, 2 months ago

Viewed 1,221 times

2

You can review the code of my CPF class?

A point of criticism is the use of regex which may be unnecessary. But in my view it makes the code shorter.

Another point is the adoption of a class for this, could be calculations done alone. But I think encapsulated in a class gets better.

It’s just a theoretical class, it’s been tested and it works, but it wasn’t used in production. Accepts valid CPF’s or no checker digit, and generates random CPF’s.

Cpf.java

import java.util.concurrent.ThreadLocalRandom;

public class Cpf {

    public static final int NRO_DIGITOS_SEM_DV = 9;
    public static final int NRO_DIGITOS_VERIFICADORES = 2;
    public static final int NRO_DIGITOS_COM_DV = NRO_DIGITOS_SEM_DV + NRO_DIGITOS_VERIFICADORES;

    private final String cpf;

    /** Recebe uma cadeia completa de dígitos de CPF, sem formatação */
    public Cpf(String cpf) {
        if (cpf == null) {
            throw new IllegalArgumentException("CPF não pode ser nulo.");
        } else if (ehCadeiaDeNoveDigitosNaoIniciadaEmZero(cpf)) {
            this.cpf = cpf + moduloOnze(cpf);
        } else if (ehCadeiaDeOnzeDigitos(cpf) && isValido(cpf)) {
            this.cpf = cpf;
        } else {
            throw new IllegalArgumentException("Número de CPF inválido: " + cpf);
        }
    }

    private static boolean ehCadeiaDeNoveDigitosNaoIniciadaEmZero(String cadeia) {
        return cadeia != null && cadeia.matches("[1-9]\\d{" + (NRO_DIGITOS_SEM_DV - 1) + "}");
    }

    private static boolean ehCadeiaDeOnzeDigitos(String cadeia) {
        return cadeia != null && cadeia.matches("[1-9]\\d{" + (NRO_DIGITOS_COM_DV - 1) + "}");
    }

    private static boolean isValido(String cpf) {
        return ehCadeiaDeOnzeDigitos(cpf)
                && cpf.substring(NRO_DIGITOS_SEM_DV, NRO_DIGITOS_COM_DV).
                        equals(moduloOnze(cpf.substring(0, NRO_DIGITOS_SEM_DV)));
    }

    /** Recebe uma cadeia de 9 dígitos decimais e retorna o módulo-11 dessa cadeia */
    private static String moduloOnze(String digitos) {
        if (false == ehCadeiaDeNoveDigitosNaoIniciadaEmZero(digitos)) {
            throw new IllegalArgumentException("Dígitos de CPF inválidos: " + digitos);
        }

        int dv1 = calcularDigitoVerificador(digitos);
        int dv2 = calcularDigitoVerificador(digitos + dv1);

        return String.valueOf(dv1) + String.valueOf(dv2);
    }

    private static int calcularDigitoVerificador(String digitos) {
        int peso = digitos.length() + 1;
        int dv = 0;
        for (int i = 0; i < digitos.length(); i++) {
            dv += (digitos.charAt(i) - '0') * peso;
            peso--;
        }

        dv = 11 - (dv % 11);

        return dv > 9 ? 0 : dv;
    }

    /** Exibe o CPF em formato 999.999.999-99 */
    public String formatado() {
        return cpf.substring(0, 3) + "." +
                cpf.substring(3, 6) + "." +
                cpf.substring(6, NRO_DIGITOS_SEM_DV) + "-" +
                cpf.substring(NRO_DIGITOS_SEM_DV, NRO_DIGITOS_COM_DV);
    }

    /** Exibe o CPF em formato 99999999999 (onze dígitos decimais) */
    @Override
    public String toString() {
        return cpf;
    }

    public String digitosVerificadores() {
        return cpf.substring(NRO_DIGITOS_SEM_DV);
    }

    public static Cpf criarAleatorio() {
        String noveDigitosAleatorios =
                String.valueOf(ThreadLocalRandom.current().nextInt(100_000_000, 999_999_999 + 1));
        return new Cpf(noveDigitosAleatorios + moduloOnze(noveDigitosAleatorios));
    }

    public static void main(String [] args) {
        Cpf cpf = new Cpf("526265690"); // Deve ter DV's 03
        System.out.println(cpf.formatado());
        System.out.println(cpf.digitosVerificadores());
    }
}

Is this code short? So... :) What I’ve seen of a problem is the inconsistency of is and eh, I would wear é :I have a thing when I see one false ==. But you didn’t use Regex for calculation which was what was being discussed in the chat. Used for validation, then it even makes sense, although I would not use (I would do in a faster line and in my opinion more readable), but it is not strictly wrong to use. The question is whether you need to do this validation alone, if you don’t need to, then you’re using something unnecessary. Huum, I just saw, it’s all private, so I think it’s exaggerated.

– Maniero

2020/05/20 at 18:43
It looks like modularization without any gain. The main function I do in 4 lines :P, as long as I don’t put anything else in it like validation. Why did the formatting not do with Regex? P

– Maniero

2020/05/20 at 18:48
The main function you speak is the calculation of the digit right? This I took from the internet, I kept a safe, I do not know how it would be shorter. Show there :P did not understand the problem that occurs in being all private.

– Piovezan

2020/05/20 at 19:06
It’s just that what we were talking about in the chat, you validate and then you create the digit, you can do it at the same time. At first I thought it was a method to be used by the consumer of the class.

– Maniero

2020/05/20 at 19:09
Perhaps I should have asked myself "what is really necessary of responsibility to the class?" Before you do it, it could have changed her face, what do you think? I did on the achometer, not on the actual requirement. I have never worked on anything that I needed to do a class to treat CPF :P

– Piovezan

2020/05/20 at 20:09
"It looks like modularization without any gain" I agree, from what I understood the mistake was to want to modularize a candidate entity that has no behavior (responsibility) and may well be a field of another entity. So it turns out that just because an algorithm is associated with a given doesn’t mean they should always be together, is that it? Where should CPF be validated then, in the business model? In a validator class? Or within the entity that contains it?

– Piovezan

2020/07/22 at 17:55
The problem is just having this lot of function, I saw no reason, it seems to be just to look organized, force the SRP, but the responsibility is too simple, so I said it can be done in 4 lines. The problem is not the type, it is the implementation.

– Maniero

2020/07/22 at 18:02
I’m really thinking that there’s an error in modeling, that CPF shouldn’t be an entity. It was in this sense that I understood the "modularization without any gain", although I understood now that you spoke of the methods. I speak responsibility for the sense of public interface, not the reason to switch from SRP. I won’t ask for counter-example code because it takes work to write :P I think I can visualize what a shorter code would look like, but if you want to give a description to help thank.

– Piovezan

2020/07/22 at 18:20
It can be yes, in many cases it is advantageous to be (although in Java it will always be bad :D) This entity is just too complex, just needs to be simpler. Actually I don’t remember all the details anymore and cooled in that, it may be that the interface is good, the problem is the implementation,

– Maniero

2020/07/22 at 18:40
The interface (things the object must know and do, "know" and "do", getStuff() and doStuff()) for example just display the formatted CPF (as @hkotsubo speaks in its response) and generate a random number. I don’t know if it’s a very meaningful or coherent interface.

– Piovezan

2020/07/22 at 18:51
There is also the problem (I didn’t notice, a colleague pointed out) that the constructor fires an exception if the CPF is invalid, that is, it is treating execution flow with exceptions. So as an invalid CPF is not an exceptional situation (the user can easily enter an invalid), it would most likely have some sort of validator.

– Piovezan

2020/08/04 at 13:39
2

@Piovezan On validation, that’s what I said in the reply: "Do you want the class to represent a CPF, or just want a verifier/digit calculator? If this were the second case, it would make more sense for a utility class with both public and static methods, which return only true/false". But if the idea is to represent a valid CPF, then it would make sense to make an exception, because then it wouldn’t let you create invalid instances (but if you can have an invalid instance, then you wouldn’t make an exception, etc.). In short, this is what has already been said, without real requirements, all that remains is to speculate :-)

– hkotsubo

2020/08/04 at 14:30
@hkotsubo You’re right, sorry, had not attacked me for that part.

– Piovezan

2020/08/04 at 14:57

Show 8 more comments

2 answers

4

As it is a theoretical class (without a real requirement), I will take it as a premise that its responsibility is to represent a CPF (and not only validate or format, because then some things could be changed, as discussed below). I mean, I pass a String containing a possible number of CPF and it only creates the instance if that number is valid ("valid" considers only if the check digits are correct if they exist - if they do not exist, the class automatically calculates).

If the idea is only this, it does not need regex, because it is only serving to verify that the String has a certain amount of digits. But you can do this check while calculating the check digits, thus saving a loop unnecessary (the regex will have to sweep the entire String to check that it only has digits). In fact you can save 2 loops (the one of regex and one of the ones that compute the check digits), as both check digits can be calculated in the same loop. An alternative to the constructor and the auxiliary methods used by him would be:

public Cpf(String cpf) {
    if (cpf == null) {
        throw new IllegalArgumentException("CPF não pode ser nulo.");
    } else if (cpf.length() != NRO_DIGITOS_COM_DV && cpf.length() != NRO_DIGITOS_SEM_DV) {
        throw new IllegalArgumentException("Número de CPF deve ter " + NRO_DIGITOS_SEM_DV + " ou " + NRO_DIGITOS_COM_DV + " dígitos");
    }
    this.cpf = verificaDigitos(cpf);
}

private static String verificaDigitos(String cpf) { // este método só é chamado depois que eu já sei que o tamanho da string é 9 ou 11
    int total1 = 0, total2 = 0;
    int multiplicador = 10;
    for (int i = 0; i < NRO_DIGITOS_SEM_DV; i++) { // percorre somente os 9 primeiros dígitos
        int c = cpf.charAt(i) - 48;
        if (c < 0 || c > 9) { // não é dígito de 0 a 9, inválido
            throw new IllegalArgumentException("CPF deve ter somente dígitos: caractere inválido na posição " + i);
        }
        total1 += multiplicador * c;
        total2 += (multiplicador + 1) * c;
        multiplicador--;
    }
    int dv1 = dv(total1, 1, cpf);
    int dv2 = dv(total2 + dv1 * 2, 2, cpf);

    if (cpf.length() == NRO_DIGITOS_SEM_DV) { // tem 9 dígitos, concatenar os dígitos verificadores
        return cpf + dv1 + dv2;
    }
    return cpf; // se chegou aqui é porque tem 11 dígitos e os dígitos verificadores são válidos, então retorna sem modificações
}

// recebe o total, o número do dígito verificador (i pode ser 1 ou 2) e a string do CPF
// se o tamanho da string é 11, verifica se o dígito verificador calculado é o mesmo que está na string
private static int dv(int total, int i, String cpf) {
    int resto = total % 11;
    int dv = resto < 2 ? 0 : 11 - resto;
    if (cpf.length() == NRO_DIGITOS_COM_DV && dv != cpf.charAt(NRO_DIGITOS_COM_DV - (3 - i)) - 48)
        throw new IllegalArgumentException((i == 1 ? "Primeiro" : "Segundo") + " dígito verificador não bate");

    return dv;
}

In the method verificaDigitos I make only one loop by the first 9 digits, calculating the summations.

A detail/curiosity is that inside the loop I could also do so:

if (i != 0) {
    total2 += (multiplicador + 1) * c;
}

It may seem strange to skip the multiplication of the first digit by 11 in total2, but in the end we will calculate the rest of the division by 11 and therefore the first term is redundant: if I have a number n and make n % 11, the result will be the same as (n + (11 * x)) % 11 (for any x integer). Adding any multiple of 11 will not change the rest of the division by 11, then adding "11 times the first digit" to total2 makes no difference (although in this case, maybe this if does not go beyond micro-optimization).

After the loop, i see if the first digit checker is equal to the CPF reported, but only if the size of the String is 11 (if 9, you have nothing to check). If valid, I proceed to the second digit. Note that in if of the method dv i don’t even need to check if the eleventh and eleventh characters are digits from 0 to 9, because if they are not, the value will certainly not match the checker digits.

Finally, I check if I should concatenate the digits at the end of String (which should only be done if her size is 9).

With this, the method that creates a random CPF can simply work with 9-digit strings (since the check digits will be calculated and concatenated within the constructor itself):

public static Cpf criarAleatorio() {
    String noveDigitosAleatorios = String.valueOf(ThreadLocalRandom.current().nextInt(100_000_000, 999_999_999 + 1));
    return new Cpf(noveDigitosAleatorios);
}

As in this particular case I already know that the String will have only digits, maybe it was worth refactoring the class to separate the validation of the calculation of the check digits, because in this case I do not need to verify that all characters are digits - is like "exercise for the reader" :-).

As for the adoption of a class for this, it depends on the objective. Do you want the class to represent a CPF, or just want a verifier/digit calculator? If this were the second case, it would make more sense for a utility class with public and static methods, which return only true/false (Valid or invalid CPF), or if you want to be more specific, return error codes that indicate what the problem was (no digits, wrong number, check digits do not match, etc.). Another alternative would be to encapsulate this error code in a class or enum specific (CpfValidationResult?) if it is worth creating such abstraction (as it is a theoretical class without real requirement, it is difficult to opine on what is most suitable).

However, if the idea is that the class represents a CPF, then I think it’s okay to cast the exception if the CPF is invalid (since it doesn’t seem to make sense to create an instance if the number is not valid, because then it would no longer represent a CPF in fact - but again, without real requirements, you can’t know, it has a requirement that allows having "CPF’s" with invalid numbers, I don’t know). But if I were a validator, I would use one of the options in the previous paragraph instead of making an exception.

Finally, the code didn’t even stay so shorter than the original, but you should not use this as the main criterion for deciding which implementation to use. Smaller code is not necessarily "better", have to use what makes the most sense in each case (and understand that in this case you did not need to use regex, for example). Not to mention that, as much as I like regex, use it is not always the best solution, in addition to generating a overhead often unnecessary.

One thing that can be improved are the names of the methods verificaDigitos and dv, but I am without creativity at the moment (maybe calculaEValidaDigitosVerificadores and calculaDigitoVerificador, respectively? I don’t know).

If you want to continue with regex (and not change so much the rest)

If you want much continue using regex, a little improvement would create them separately, since each call to String::matches compile the regex again. But since they will be used multiple times (every time you create a new instance) and do not change, then an alternative would be to compile them only once:

private static final Pattern REGEX_9_DIGITOS = Pattern.compile("^\\d{9}$");
private static final Pattern REGEX_11_DIGITOS = Pattern.compile("^\\d{11}$");

private static boolean ehCadeiaDeNoveDigitosNaoIniciadaEmZero(String cadeia) {
    return REGEX_9_DIGITOS.matcher(cadeia).matches();
}

private static boolean ehCadeiaDeOnzeDigitos(String cadeia) {
    return REGEX_11_DIGITOS.matcher(cadeia).matches();
}

Or:

private static final Matcher REGEX_9_DIGITOS = Pattern.compile("^\\d{9}$").matcher("");
private static final Matcher REGEX_11_DIGITOS = Pattern.compile("^\\d{11}$").matcher("");

private static boolean ehCadeiaDeNoveDigitosNaoIniciadaEmZero(String cadeia) {
    return REGEX_9_DIGITOS.reset(cadeia).matches();
}

private static boolean ehCadeiaDeOnzeDigitos(String cadeia) {
    return REGEX_11_DIGITOS.reset(cadeia).matches();
}

The difference is that the first option with Pattern is thread-safe, while the second (with Matcher) nay. Without knowing the scenario in which the class will be used, it is not possible to suggest the "best".

Detail that there are CPF’s that start with zero, and do not consider such cases can cause problems in the real world. So I changed their regex, which didn’t let the CPF start with zero. Now they only check if it has 9 or 11 digits, whatever they are.

The methods ehCadeiaEtc do not need to check whether the String is null, because you only call these methods at points where it is already guaranteed that it is not.

There’s another "weird" detail in this if inside the builder:

else if (ehCadeiaDeOnzeDigitos(cpf) && isValido(cpf))

It’s strange because within the method isValido we have:

private static boolean isValido(String cpf) {
    return ehCadeiaDeOnzeDigitos(cpf) && cpf.substring(NRO_DIGITOS_SEM_DV, NRO_DIGITOS_COM_DV).equals(moduloOnze(cpf.substring(0, NRO_DIGITOS_SEM_DV)));
}

That is to say, ehCadeiaDeOnzeDigitos(cpf) is called twice, unnecessarily. Decide where it is best to call it (in if or within isValido) and do it only once. Also decide whether to use the prefix eh or is (whatever, as long as it’s consistent across the code).

And you can stay forever elucidating about various things: the constants NRO_DIGITOS_etc are really necessary? If they are, do they need to be public? Methods private also need to be static? (functionally, it makes no difference, since none of them changes or accesses the state of the instance) Should the same class that represents the CPF be responsible for formatting and creating a random instance? And so on... (and without real requirements, it is difficult to decide or opine on something - even with requirements would be a good discussion)

1

Quite detailed @hkotsubo, thank you. I didn’t know this requirement to accept 0 at the beginning of the CPF, I must have got the bad source algorithm, or worse, I deduced it somehow. : You have to be careful with the requirements. I do not care about the regex, but I found it a bit exaggerated to optimize the number of loops, I didn’t need so much :) I would be satisfied with a loop just to validate the digits, it would be clearer, in my opinion. And well observed the call of ehCadeiaDeOnzeDigitos(), as you see, I wasn’t worried about performance. :)

– Piovezan

2020/05/22 at 18:17
@Piovezan Yeah, calculating the 2 digits in the same loop may be micro-optimization too, but I thought it worth mentioning that it is perfectly possible, and without complicating the code too much :-)

– hkotsubo

2020/05/22 at 18:31

Browser other questions tagged java code-review

You are not signed in. Login or sign up in order to post.

by Thiago Nascimento • 1 point · Answer 1 · 2020-08-04T11:48:48+00:00

Hello, you do not need to use class or interfaces for Cpf validation if you are using Hibernate. Use only the @CPF annotation as it already has validation implementation.