Is it recommended to use a natural primary key?

12

A natural primary key is one that contains real data, rather than an arbitrary value such as an auto-generated integer.

Example:

-- common (surrogate key)
CREATE TABLE pessoas (
  id INTEGER PRIMARY KEY,
  cpf INTEGER NOT NULL
  -- other fields
);

-- natural key
CREATE TABLE pessoas (
  cpf INTEGER PRIMARY KEY
  -- other fields
);

In the second case I use the person’s CPF (the Brazilian individual taxpayer number) as the key, since it does not repeat. Is this recommended?

  • I wouldn’t use it, and I wouldn’t recommend it. In the case of the CPF, for example, two people sometimes turn up with the same number. It’s rare, but it happens. Example: http://g1.globo.com/fantastico/noticia/2014/02/mulheres-descobrem-que-alem-de-nomes-iguais-tem-o-mesmo-cpf.html

  • I understand the CPF was only an "example", as stated; the question is not about the CPF itself... We can imagine a system that has no interest in registering anyone who lacks the number. I have worked with a similar requirement: journals, which needed an ISSN in order to exist... CPF and ISSN are good examples of URNs guaranteed by authority control. In this context I find the question interesting.

4 answers

13


Short version - No, it’s not recommended.

Long version - I would restate your question as follows:

Does the benefit of having a database free of artificial constructs outweigh the cost of maintaining a natural primary key?

So let’s have a quick list of positive points for each option:

Natural Primary Key

  • Faithful representation of the data domain
  • Does not clutter the domain with an artificial identifier

In this implementation, your data model is as close as possible to the universe defined by the scope. Cross-referencing data from external databases is easier because real data is used as the identifier (e.g., CPF or ID card).

IDs, GUIDs and derivatives

  • Decouples the process from the data it represents
  • Provides its own mechanisms for linking records and guaranteeing uniqueness
  • Avoids being blocked by repeated data
  • Allows the presence of partial data
  • Security through obfuscation of natural data
  • Immutable key values prevent index fragmentation
  • Fewer errors when importing and merging data

A data model is designed to reproduce the behavior of the data present in the universe defined by the scope. However, being an abstract model, it must provide its own mechanisms to allow this representation.

If you use a natural key, you are delegating responsibility for uniqueness to an external source. If that uniqueness is violated, your model will be negatively affected. For example, two people with the same CPF, or a CPF that was mistyped and happens to be already present in the database.

In situations where data is incomplete and the natural identifier fields are not yet present, a model based on surrogate keys would still allow the creation of records containing a partial representation.
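To make the partial-data point concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a surrogate `id` plus a nullable, unique `cpf`; the `nome` column and the sample values are hypothetical, and `cpf` is stored as TEXT (an assumption, not the question's INTEGER) so that leading zeros survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pessoas (
        id   INTEGER PRIMARY KEY,  -- surrogate key, generated by the database
        nome TEXT NOT NULL,        -- hypothetical column for illustration
        cpf  TEXT UNIQUE           -- natural identifier, may be filled in later
    )
""")

# Two partial registrations: neither person has supplied a CPF yet.
conn.execute("INSERT INTO pessoas (nome) VALUES ('Ana')")
conn.execute("INSERT INTO pessoas (nome) VALUES ('Bruno')")

# The CPF arrives later and is attached to the existing record.
conn.execute("UPDATE pessoas SET cpf = '12345678909' WHERE nome = 'Ana'")

rows = conn.execute("SELECT id, nome, cpf FROM pessoas ORDER BY id").fetchall()
print(rows)
```

With `cpf` as the primary key, neither row could have been created before the number was known.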

  • I liked the long answer. I think we can imagine a system that has no interest in registering anyone who lacks the number. I have worked with a similar requirement: journals, which needed an ISSN in order to exist... CPF and ISSN are good examples of URNs guaranteed by authority control. Do you consider URNs good candidates for your natural primary key concept?

  • @Peterkrauss some thought experiments - 1) Partial registration: the final form of the record must have a mandatory URN, but the creation process is fragmented (for example, creating a user in a system where the first step is e-mail confirmation). 2) Guarantee of uniqueness: even the ISSN is not unique for publications (see JID). If uniqueness is important to the system, a supplementary mechanism must necessarily be implemented. 3) Scope expansion: how do you deal with a new need (e.g., ISBN in addition to ISSN)?

  • In these cases, URNs may be candidates if and only if several conditions are met; even so, the final solution may have a rigidity that is difficult to justify. A cost/benefit assessment of future model updates may be required.

12

I’d say it’s not recommended.

1. Not everyone has a CPF

What about foreigners? And minors who do not yet have a CPF? Are they simply unable to create an account?

2. CPF generators

Imagine someone creating an account on some site who, out of mistrust or even malice, decides not to use their own CPF and takes a number from a generator instead. One fine day the real holder of that CPF decides to create an account on the site and is obviously blocked, because the system reports that an account is already attached to that CPF. How do you resolve such a situation?

I’m not saying it’s impossible to solve, but you’ll be creating extra complexity just because you didn’t want to use an auto-increment column as the primary key and decided to use the CPF.

3. The same user cannot have multiple accounts

Is there any really good reason not to let the user have two accounts? You have already prevented the child from having an account, and now you will not let the parent have two accounts either (one for himself and one for his child, for example). Maybe he takes Grandma’s number, or he uses a generator (sigh).

4. What if the user cannot recover the password?

Suppose the user created an account and never accessed it again. After a while he tries to log in, but he does not remember the password and has also lost access to his e-mail or changed e-mail address. You will need an extra plan to recover the user’s password and keep exactly the same account he created in the first place.

Again, I’m not saying it’s impossible, but it’s an extra complexity that you’ll have to develop, while you could just let the user create a new account.

5. How do you handle a change of ownership?

Take a Sky subscription, for example: suppose one day I want to transfer ownership without having to cancel the current account, return the device, sign up for a new account and receive a new device. How would that work?

Okay, you can change the PK and use triggers or whatever, but it’s again another extra complexity in your system.
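For comparison, a minimal sketch of the same transfer when the account has a surrogate key, using Python's sqlite3 module. The tables `assinaturas` (subscriptions) and `aparelhos` (devices), their columns and the CPF values are all hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE assinaturas (
        id          INTEGER PRIMARY KEY,  -- surrogate key, never changes
        cpf_titular TEXT NOT NULL         -- current holder, ordinary data
    );
    CREATE TABLE aparelhos (
        id            INTEGER PRIMARY KEY,
        assinatura_id INTEGER NOT NULL REFERENCES assinaturas(id)
    );
    INSERT INTO assinaturas (id, cpf_titular) VALUES (1, '11111111111');
    INSERT INTO aparelhos (id, assinatura_id) VALUES (10, 1);
""")

# Transfer of ownership: a single UPDATE on a non-key column.
conn.execute("UPDATE assinaturas SET cpf_titular = '22222222222' WHERE id = 1")

# The device is still linked to the same subscription, now under the new holder.
linked = conn.execute("""
    SELECT a.cpf_titular, d.id
    FROM assinaturas a
    JOIN aparelhos d ON d.assinatura_id = a.id
""").fetchall()
print(linked)
```

No key changes, so no triggers or cascading updates are needed in the referencing table.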

6. What is the advantage of using the CPF as the PK?

Unless you have a really good reason, the only "advantage" I’ve thought of so far is saving one column in the table.

  • +1 Great answer!

  • It is worth remembering that the CPF is itself an artificial key in the Federal Revenue’s database.

5

It is not a good idea, and there are several reasons for that. First, the business rule can change, either because the law changed or because the business changed, and in that case you would have problems with the natural primary key. The recommendation is to create the table with a generic key (ID) and, if necessary, make the natural key a Unique Key. The Unique Key ensures that the natural key is never repeated, while relationships with other tables go through the "generic" key, the ID.
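A minimal sketch of that layout, again with Python's sqlite3 module. The child table `pedidos` and the CPF value are hypothetical, and `cpf` is stored as TEXT by assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE pessoas (
        id  INTEGER PRIMARY KEY,   -- generic key, used in relationships
        cpf TEXT NOT NULL UNIQUE   -- natural key, kept unique by a constraint
    );
    CREATE TABLE pedidos (         -- hypothetical child table
        id        INTEGER PRIMARY KEY,
        pessoa_id INTEGER NOT NULL REFERENCES pessoas(id)
    );
""")

conn.execute("INSERT INTO pessoas (cpf) VALUES ('12345678909')")
conn.execute("INSERT INTO pedidos (pessoa_id) VALUES (1)")

# A second row with the same CPF violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO pessoas (cpf) VALUES ('12345678909')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

total = conn.execute("SELECT COUNT(*) FROM pessoas").fetchone()[0]
print(duplicate_rejected, total)
```

If the CPF rule changes later, only the constraint on `pessoas` has to change; `pedidos` keeps pointing at `id`.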

5

Not a good idea. Here are some arguments:

1 - A person can be both a natural person (CPF) and a legal person (CNPJ).

2 - If the person does not have the document at hand, or the user doing the registration does not have this data, the registration simply could not be done, or you would have to invent a temporary CPF, and a valid one at that (if the system validates CPFs).

3 - In the future the government may abolish the CPF and create a single document, or, who knows, simply rename the CPF.

4 - I don’t think it’s a good idea for a PK to depend on user-supplied data. If you want only one person to exist with a given number, just create a UNIQUE KEY.
