Why is it good practice to generate a unique user code?


9

Many enterprise systems use a customized unique identifier. There is the ID generated by the database and, alongside it, a custom code. When a customer is searched for, the custom code (something like ZT2578B) is used instead of the table's primary key ID.

If there is already a unique identifier generated by the database, why should I have a custom one?

What are the best practices or rules for generating such a code?

  • 3

    "Good practice" is to use it when you have a real reason for it. One example among several reasons to use custom formats (just to get away from what has already been posted): the car plate in Brazil, which is a "unique identifier", was letters + 4 digits so that anyone could memorize it on sight and write it down later. Nothing would prevent it from being a plain sequential number, but that would be a pain to manage, besides losing the ease of memorization. Note: too bad that whoever came up with the crazy Mercosul plate (full of problems) does not understand these basic things.

2 answers

12


For those with no experience of them: these codes are usually classificatory, and each character means something specific (a grouping, a way of using the item, and so on), as if they were tags. They may look random but they are not, and in some cases they may even expose sensitive information, so they serve no security purpose.
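To make the "each character means something" idea concrete, here is a minimal sketch. The layout is purely an assumption invented for illustration (two letters for a group, four digits for a number, one letter for a type); the question does not say how ZT2578B is actually built, and `parse_code` and `CustomerCode` are hypothetical names.

```python
import re
from typing import NamedTuple

# Hypothetical layout, invented for illustration only:
# 2-letter group + 4-digit number + 1-letter type.
CODE_PATTERN = re.compile(r"(?P<group>[A-Z]{2})(?P<number>\d{4})(?P<kind>[A-Z])")


class CustomerCode(NamedTuple):
    group: str
    number: int
    kind: str


def parse_code(code: str) -> CustomerCode:
    """Split a classificatory code into its meaningful segments."""
    match = CODE_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"not a valid code: {code!r}")
    return CustomerCode(match["group"], int(match["number"]), match["kind"])


print(parse_code("ZT2578B"))
```

Anyone who has seen a few such codes can write the same parser, which is why a code like this reveals information rather than hiding it.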

If there is already a unique identifier generated by the database, why should I have a custom one?

You would have to ask whoever built it, right? It must be a system requirement.

It may be that these codes have always been used, and it would make no sense to change what everyone is used to because of the database.

It may be that the code has specific semantics that help identify what the item is, so it is easier for people to handle than a sequential database ID. So it’s a matter of UX.

It could be legislation, regulation, or common industry-wide practice.

It’s almost never about security, and it would make no sense for it to be. If someone needs this to keep the system secure, then all is lost. This is called security by obscurity and is known to be problematic. If you have another security measure, doing this changes nothing; if you don’t, the system is insecure. If that is why they did it, and only those who did it can say, tremble with fear.

It is very easy to work out how a structured, non-sequential code is assembled after seeing a few of them. No code should ever grant insecure access. Every system must be prepared to allow legitimate access securely and to block unsafe access, even through paths its designers did not foresee. And note that the code is classificatory, not a long and almost random code like a UUID (which has its own problems, but that’s another matter). You need a good authentication and authorization mechanism controlling security; then the code is irrelevant to it. Basic rule: data coming from outside is insecure by nature, and that includes any codes received.
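The point about authentication and authorization can be sketched as follows. This is only an illustration under assumptions: `ORDERS` is a hypothetical in-memory stand-in for a table, and `get_order` and `Forbidden` are made-up names. What matters is that access is decided from who is asking, never from how hard the identifier is to guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Order:
    order_id: int
    owner: str


class Forbidden(Exception):
    """Raised when the caller may not see the requested record."""


# Hypothetical in-memory store standing in for a database table.
ORDERS = {105: Order(105, "alice"), 106: Order(106, "bob")}


def get_order(order_id: int, authenticated_user: Optional[str]) -> Order:
    # The identifier comes from outside, so it is untrusted input.
    if authenticated_user is None:
        raise Forbidden("authentication required")
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours", so nothing leaks about other records.
    if order is None or order.owner != authenticated_user:
        raise Forbidden("order not found or not accessible")
    return order
```

With a check like this, guessing 106 after seeing 105 gets "alice" nothing, whether the identifier is sequential, classificatory or random.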

If a sequential code is sensitive, one that carries a classification can be even more sensitive and reveal relevant information, not that it matters.

I didn’t even get into whether this code might be only for internal applications, because the question doesn’t make that clear. But do not fall for this as security in any case. I repeat, it is 99% certain that it is not about security, and the other 1% is a misguided attempt to provide security. The vulnerability is thinking that the code gives some security; the "invader" knows it doesn’t and counts on the naivety of whoever built the system. Insecurity always comes from the developer’s ignorance or naivety.

There is another question that can be asked: why not use this code as the primary key?

I answered that in "Values that can be entered as primary key". In short, this code may not be stable, so it’s not a good candidate.

What are the best practices or rules for generating such a code?

I’m glad you asked why. Too bad this is a case where I can only speculate.

Good practice is a crutch for those who do not want to learn the real motivation. Understanding how something works in all its details equips you to make the right decisions, instead of adopting something that someone else adopted for their own case without considering your context, which is how you fall for that bullshit. A good practice may exist precisely because people don’t understand the subject. I know of many wrong things done in the name of good practice because people don’t understand the context and copy without assessing the real need. It may even have been good practice decades ago and make no sense today, yet everyone keeps doing it.

If you search here for the use of primary keys, there is enough information (I have answered several questions myself) to understand what may or may not be used this way and why the primary key must be stable.

Now, the code the company uses to identify the product is not something the developer should decide; for the developer, what matters is the database ID. This secondary code is for people to use, it exists to improve the user experience, and it should be decided by the people involved in the business process, not by technology. Some organizations even have a committee to decide these codes, that is how important they are.

Just make sure it can be looked up easily: have a secondary index so it can be found quickly.
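A minimal sketch of that arrangement, using Python's built-in `sqlite3` module. The table and column names (`customer`, `code`, `name`) are assumptions for illustration; the stable surrogate `id` stays the primary key, while the business code gets its own unique secondary index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,  -- stable surrogate key, referenced by other tables
        code TEXT NOT NULL,        -- business code shown to people; may change over time
        name TEXT NOT NULL
    )
    """
)
# Secondary index: makes lookups by code fast and keeps the code unique.
conn.execute("CREATE UNIQUE INDEX idx_customer_code ON customer (code)")
conn.execute(
    "INSERT INTO customer (code, name) VALUES (?, ?)",
    ("ZT2578B", "Example Customer"),
)


def find_by_code(code):
    """Return (id, name) for the given business code, or None if absent."""
    return conn.execute(
        "SELECT id, name FROM customer WHERE code = ?", (code,)
    ).fetchone()


print(find_by_code("ZT2578B"))
```

Because foreign keys point at `id`, the business side can renumber or restructure `code` later without touching related tables.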

In general, when you have this type of code, the table ID should not be exposed, though there are exceptions.

You need to check whether its creation should be validated in some way, and whether there is a clear rule that can be checked against other input data.

I’m not saying I know what’s best for this case, just giving some ideas to think about.

4

I believe the main reason is this: security.

And we can mention two points about security: making it harder for a vulnerability to be exploited, and the security of the information itself.

Using the sequence to exploit a flaw

Sequential identifiers are commonly used in tables by means of an identity column or a sequence (depending on your database). One advantage of these identifiers is that, because they are sequential, they are more readable to users: just by looking at them you can get a sense of the number of records in the table, sorting is easier, they give you (more or less) the order in which the records were created, and so on.

However, because they are sequential, you can also easily guess the identifier of the next record. This may be considered sensitive information when you need to uniquely identify a record in a system that exposes this data externally, because by exposing the identifier you provide information that can be used in a possible attack.

That is, through some vulnerability of the system, the attacker can exploit the flaw knowing all possible identifiers of the records. I can cite at least one case in Brazil where this occurred.

See an example of an API providing this sensitive information:

GET /pedidos/105

It is clear from the request above that these order records are identified sequentially. That is, there is probably an order with identifier 106, 107, 108, etc. A vulnerability in this part of the system, or in any other part, could compromise the data of the orders.

For this reason, many systems choose to generate unique, non-sequential identifiers, so that it is difficult for an attacker to use this information to draw any conclusions in their favor. The identifier does not necessarily need to be generated in the database; it can also be generated independently by any system (see UUID, for example).

This way, our same API would be:

GET /pedidos/123e4567-e89b-12d3-a456-426655440000

This leaves the attacker unable to guess any other order identifier.
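One way to get such a URL while keeping the sequential ID internally is sketched below. It is an illustration under assumptions: the dictionary stands in for an indexed column in the orders table, `register_order` and `resolve` are hypothetical names, and random version-4 UUIDs from Python's `uuid` module are one possible choice among several.

```python
import uuid

# Hypothetical mapping from public identifier to internal sequential ID.
# In a real system this would be an indexed column, not a dict.
_public_to_internal = {}


def register_order(internal_id: int) -> str:
    """Create a random public identifier for an internal sequential ID."""
    public_id = str(uuid.uuid4())  # version 4: generated from random bits
    _public_to_internal[public_id] = internal_id
    return public_id


def resolve(public_id: str):
    """Return the internal ID for a public identifier, or None if unknown."""
    return _public_to_internal.get(public_id)


public_id = register_order(105)
print(f"GET /pedidos/{public_id}")
```

Seeing one public identifier says nothing about the next one, but the route still has to check who is asking; the identifier only removes the guessable pattern.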

Using the sequence to extract business information

There is also the security of the information itself, because this identifier alone can say a lot about the data of the system.

For example, a potential competitor can draw several conclusions from this sequential numbering. By comparing identifiers obtained on different days, they can estimate, from the identifier alone, the volume of orders your system is generating.
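The arithmetic a competitor would do is trivial. As a sketch, with made-up numbers and a hypothetical helper name: place one order, note its identifier, place another some days later, and divide the difference by the elapsed days.

```python
from datetime import date


def estimated_daily_orders(first_id, first_day, second_id, second_day):
    """Estimate orders per day from two sequential IDs seen on different days."""
    days = (second_day - first_day).days
    if days <= 0:
        raise ValueError("second observation must be on a later day")
    return (second_id - first_id) / days


# Made-up observations: order 105 on 1 January, order 805 a week later.
print(estimated_daily_orders(105, date(2024, 1, 1), 805, date(2024, 1, 8)))  # 100.0
```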

Unique code or identifier: which to use?

The sequential identifier solution does not exclude the use of a "unique code". You can have both. Depending on the unique code and the volume of data in your tables, it is preferable to keep the sequential identifier, as it probably performs better.

  • 2

    Dear @Dherik, I don’t want to disagree with your answer, but UUID is not about security, unless you mean security outside the application level, such as merely preventing someone from deducing the number of records, which would be the supposed security flaw. A UUID is not guaranteed to be unique; it can "fail". And even if that were not the case, in practice an attacker could still deduce some things, and getting another user’s ID would only be a security breach if something in the back end is badly done.

  • 2

    You see, here at SOpt the IDs are all exposed, and this has never affected security. I am not entirely disagreeing with the answer, but perhaps the terms "security" and "vulnerability" should be used with a little more context, or simply avoided, because they create confusion. People may think that using a "unique code" increases security. YouTube generates URLs like this, but it has probably never affected anything. It may simply be a misconception, or it may be for "better" readability.

  • "like merely preventing someone from deducing the number of records, which would be the supposed security flaw". Yeah, that’s part of my point. The other part is that the attacker could exploit some kind of vulnerability knowing the order identifiers. I corrected my answer to make it clearer; it really was confusing.

  • 2

    What kinds of vulnerability could you exploit knowing my SOpt ID here? It appears in the URL: https://answall.com/users/3635/guilherme-nascimento, the value is 3635.

  • Let’s say all your information was private. If I knew your ID and the SO IDs were sequential, I could take advantage of an SO vulnerability (for example, some route that did not require authentication) and take all users' private information. This has in fact already happened, in an episode that became famous in Brazil, taking advantage of the problem mentioned above.

  • If I understood correctly, knowing whether it is sequential or not, you would still have to make use of another, more serious problem, some other existing hole. Is that it?

  • @Guillhermenascimento exactly!

  • 2

    Great, now that I understand what you meant, let’s make a comparison to understand what security is about. Imagine someone steals my FTP login and password. By your argument, the security equivalent of a "unique code" would be to create files, scripts and folders with random names to confuse whoever has already gained improper access to my FTP. But tell me, did shuffling the names actually bring any security?

  • 2

    Surely the issue is not security, or almost surely. I’d say there is a 1% chance that it is, but only because whoever did it has no idea what they’re doing. I say this because this type of code is basically sequential, just not so obviously; it becomes obvious after looking at a few codes. This type of code is very common in certain kinds of activity; it is not random and cannot be confused with something like a UUID. Security is a possible explanation, but a highly improbable one, and if it was done by mistake, it is not secure. I’ll change my answer to talk about it, but I won’t go too far.

  • @Guilhermenascimento, in your example I see no advantage, but I will give another example building on yours. Let’s say a social network has private profiles and people control access to their own photo albums. The photos themselves are public, but you don’t know the direct link to them. However, if there is some kind of "guessable" pattern in the names of the photos, you can access them without having access to the user’s album. The late Orkut had this problem, and then they fixed it.

  • 2

    UUID alone does not make any difference to security. There is confusion in the answer and comments on this subject. The GET case is along those lines: the problem is not that the ID is sequential, it’s the way you’re using it. You can have a sequential ID without any problem, and never show it to the public.

  • But in that last case the solution would not even be an "exclusive code"; the error was entirely in the way they stored things. Facebook also had this, if I am not mistaken. The storage strategy was all wrong, and adding an "exclusive code" would not necessarily solve it, so much so that the "solutions" applied on these social networks were to fix the storage, not to create exclusive codes. But I think those are the points I meant: it may seem that one thing solves the problem when in fact it is a wooden wall painted to look like a concrete wall.

  • @Bacco, I did not say that one replaces the other, I only explained why a "unique code" exists. "You can have a sequential ID with no problem at all and never show it to the public. If it being sequential is a problem, show something else in the URL that leads to the desired ID." This matches what I explained above.

  • 3

    I don’t know whether it is clear that this does not provide security. I think many people misunderstood the question, so I even added an introduction to my answer. The question is about what made someone adopt this. It is not a very good question, because only whoever built the system can answer. One possible speculation is security, but a bad one, adopted only by mistake. The question did not say, which leaves room to speculate, but nothing indicates that the system is web-based and accessed externally, let alone implies UUIDs, which are another thing; a lot got mixed up. It worries me that some people liked the answer,

  • 3

    because it means that they believe this is really secure. Whoever wants to "invade" will invade even if you do this, provided there is another problem that allows it. This is like putting a band-aid on a bullet wound. More importantly, the code is too simple to be something used to "increase security". It is not random; being random would help a little to obscure certain patterns, but not to increase security, which is too strong a word for what you would get from a random code.

  • @Maniero, the few times I saw a unique code used in a table, it was usually to expose something externally, so I thought it was important to mention. As for UUID, I cited it as an example; there are other alternatives. The cases I mentioned above (Orkut and Catho) were examples of a vulnerability exploited through this sequence that could have been avoided or hindered if a non-sequential identifier had been used.

  • 3

    @Dherik then you need to see other things; almost everything I have seen was not that. Well, I don’t work with startups and I don’t think I’ve worked with many "normal" companies that use it. It’s extremely common. The case you mentioned would not have been avoided, because the data was public to everyone with an account on those sites. There was no security breach; it could have had any code and everything would still have been captured. It might have taken slightly more work to write a script, and mind you, maybe there was already something in place that made things difficult and still didn’t stop it. You’re inferring that you didn’t have

  • @Maniero then I believe you also need to see other things. I don’t work with startups (I don’t know if I understand this part, but that’s okay), so I believe they were all "normal" companies. As for avoiding or hindering, this may vary depending on the vulnerability and on the amount of information the attacker has (or that someone has provided to them). The less information you provide, the fewer chances you give someone to exploit a security flaw in your system.

  • 3

    This topic of "security by obscurity" was demystified long ago; I recommend that anyone interested in the discussion research it and draw their own conclusions. The subject has been beaten to death. In this specific case I think two examples were given where the unique ID would supposedly help (on the assumption that in that very specific, hand-picked scenario and bug, "non-sequencing" would help somehow, as if no one could write a script that tried various values), while the thousands of other cases where the ID is sequential and causes no problem, and never will, were forgotten.

  • 1

    @Dherik I do see other things. I see terrible things done by inexperienced people who do what startups do, even outside startups, because that’s what’s trendy, and they follow whatever startups say is good. I have experience developing ERP serving thousands of companies (almost all large, some medium); almost all of them use codes like the one quoted in the question, and none do it for security; it is for internal or sector organization. What I find bad is people thinking that this gives some security. That is precisely why insecurity exists, the "if you take it beforehand you don’t get pregnant" kind.

  • @Bacco, I am aware of the term security by obscurity; it is a subject that covers a lot of things. I just don’t think it applies in this case.

  • @Maniero, I have never worked with ERP, but I have also worked with internal products at large companies (not that either is a strong argument). Of course, using an "external code" alone will not make the application secure, but I believe that point is already clear in the comments and in my answer.

  • @Dherik, I’m sorry, but the problem with your answer can be summed up precisely in that last comment to Bacco: that is exactly security by obscurity, without any different nuance. And in your last comment you say it’s no longer about security, which is not clear in your answer. Its first line says that the main reason (not even a secondary or occasional one) is security. It then says otherwise, but that denies the reason for the answer, and without that reason the answer has no reason to exist. Anyway, I’ve already lost too much time. At least whoever bothers to read everything will see that this is a mistake.

  • @Maniero, "without any different nuance": this is also debatable. The reason I put security is there in the second paragraph of my answer. I think you understood what I meant; it is not a matter of contradiction, you just do not agree (which is perfectly healthy). I agree.
