Suitable type to use with CPF is numerical or character?

Asked

Viewed 1,681 times

1

I have a database with two tables that store the CPF. In one of them the CPF is stored as numeric(11,0) and in the other the number is stored as character.

If there is any difference, which of the two types of data occupies the most storage space?

2 answers

10

The character occupies slightly more, but is correct. CPF is descriptive information that happens to be composed only of digits, one day may not even be like this. You don’t have to do calculations with it, it doesn’t quantify anything, so using a numerical type doesn’t make any sense.

See more.

This can be easily tested in the database:

SELECT pg_column_size(CHAR(11) '999999999'), pg_column_size(VARCHAR(11) '999999999'), pg_column_size(NUMERIC(9,0) '999999999');

Behold working in the SQL Fiddle. Also put on the Github for future reference.

Note that the CHAR is what occupies most (what surprises me, this seems something bad from Postgresql), but it is the most correct semantically. Think about it, if you know that this information has 11 characters, why would you create a type that size is variable? In normal databases this should be the most economical since it does not need any control metadata, who knows why Postgresql does it. In another database can give a different result. It would not be better to use a general solution?

Interestingly use a VARCHAR is a premature optimization.

  • Reading the documentation and see how much each type occupies.

  • Hello, Maniero! Thank you for your reply. His reply and that of Lacobus helped me a lot to understand, however, the latter helped me with a subtle detail, indicating the type of data character varying.

  • Oh yeah! I didn’t even think to read the documentation that’s why I posted in ptSO. /s

  • @Tonymontana I set how to measure.

5


I understand that a CPF is not a numerical type, but a string of digits (There are Cpfs that start with zeros to the left), so a better abstraction of the data would be the use of a type character varying(11).

And even if the guy character varying(11) takes up more space than the type numeric(11,0) and the intention is to "optimize", yet it is not worth replacing it. Optimizations of this nature are almost always "precocious".

Browser other questions tagged

You are not signed in. Login or sign up in order to post.