Using a timestamp as a unique identifier in a PHP process?


1

Here is the scenario:

A trading site that handles many transactions per second, hosted on high-performance multi-core machines at AWS.

PHP's time() function is being used to generate a unique transaction identifier.

I don't know whether it has simply never happened by chance or whether it can happen at all, but is it possible for two or more users connected at the same time to generate identical identifiers?

2 answers

2

Yes, the chance of a collision is high. Ideally, each transaction should receive a unique, incrementing identifier; if that is not possible, I recommend prefixing the value with a machine identifier. Running the result through a hash is better still.

For example:

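<?php

// Assumption: $serverName identifies the machine; gethostname() is one way to obtain it.
$serverName = gethostname();
$id = sha1($serverName . time());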

You can also add more information, such as the session ID.

  • But can't the hashing algorithm also generate two identical IDs? Look at what is being fed in: the server name (which is a constant) and time(), which will be the same for two or more users who connect simultaneously. Correct?

  • Yes, there is still a risk of collision. It is a risky scheme, but you can come up with something that reduces the risk by combining variables, such as the first 10 characters of the hash plus the last characters of the session ID, plus something else. Is the database relational?

  • Yes, the database is relational. What I usually do in a scenario like this is generate a pseudo-random value and, before saving, check whether it already exists. If it does, I generate another value; if not, I save it to the database. The thing is, my boss wants to use a timestamp, and I am gathering arguments to refute that approach.

  • The topic to search for is "collision". You will find figures on it.
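
The generate-and-check approach described in the comments above could be sketched roughly as follows. This is only an illustration, not the commenter's actual code: generateTransactionId() and the $tryInsert callback are hypothetical names, and an in-memory array stands in for a table with a unique index on the ID column.

```php
<?php
// Hypothetical helper: draw random IDs until the store accepts one.
// $tryInsert must return true when the ID was stored and false when it
// already exists (e.g. a unique-index violation in a real database).
function generateTransactionId(callable $tryInsert, int $maxAttempts = 5): string
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $id = bin2hex(random_bytes(16)); // 128 random bits as 32 hex characters
        if ($tryInsert($id)) {
            return $id;
        }
    }
    throw new RuntimeException('Could not generate a unique ID');
}

// Stand-in for the database table (assumption, for illustration only).
$table = [];
$tryInsert = function (string $id) use (&$table): bool {
    if (isset($table[$id])) {
        return false; // duplicate: the caller should retry
    }
    $table[$id] = true;
    return true;
};

$first = generateTransactionId($tryInsert);
$second = generateTransactionId($tryInsert);
```

In a real relational database, letting the INSERT itself fail on the unique index avoids the race between the existence check and the write.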

0

If the site handles many transactions per second and uses time(), then you will naturally get many collisions per second. time() returns the current Unix timestamp, that is, the number of seconds elapsed since January 1, 1970 (UTC).

So if you use time() as a unique identifier, you can process only one transaction per second. Two transactions processed within the same second will get the same value.

microtime(), in turn, reports microseconds, which is less bad, but without some kind of lock there is still a risk of two transactions receiving the same identifier.
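
A rough way to observe this on your own machine is to generate identifiers back to back and count the distinct values. The loop size of 1000 is arbitrary, and this sketch runs in a single process, whereas a real site would have many processes on several cores calling the clock at once; the microtime() count will vary with hardware and load.

```php
<?php
// Generate 1000 "identifiers" back to back with time() and microtime().
$fromTime = [];
$fromMicrotime = [];
for ($i = 0; $i < 1000; $i++) {
    $fromTime[] = time();           // whole seconds
    $fromMicrotime[] = microtime(); // string of the form "microseconds seconds"
}

$uniqueTime = count(array_unique($fromTime));
$uniqueMicrotime = count(array_unique($fromMicrotime));

echo "time(): $uniqueTime unique values out of 1000\n";
echo "microtime(): $uniqueMicrotime unique values out of 1000\n";
```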


If you want an identifier that is unique, or as close to unique as possible, use a CSPRNG, which makes the value random (strictly speaking it is deterministic, but that is beside the point here). The probability of two transactions generating the same sequence of bytes is low, and it gets lower as the sequence gets longer, for example:

$random_bytes = random_bytes(128);

You can use bin2hex(), base64_encode(), or unpack() (which is preferable) to turn the bytes into a printable representation.
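
To put rough numbers on "low, and lower with more bytes", the birthday approximation can be evaluated for different identifier sizes. This is a sketch under the assumption that identifiers are uniformly random; collisionProbability() is a hypothetical helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: approximate probability that at least two of $ids
// uniformly random identifiers of $bits bits collide (birthday approximation):
// p ≈ 1 - exp(-n(n-1) / (2 * 2^bits))
function collisionProbability(float $ids, int $bits): float
{
    $space = 2.0 ** $bits;
    // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -expm1(-$ids * ($ids - 1) / (2 * $space));
}

// Compare 4, 16 and 32 random bytes for one million identifiers.
foreach ([32, 128, 256] as $bits) {
    printf("%3d bits, 1e6 ids: %.3e\n", $bits, collisionProbability(1e6, $bits));
}
```

With 4 random bytes (32 bits), this approximation reaches about 50% at roughly 77,000 identifiers, which is why a longer byte string is the safer choice for a busy site.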


If you want to return numeric values you can use:

implode('', unpack("N*", random_bytes(4)));
// Result: 555172730
// Result: 1360929444


The N format code means an unsigned long (always 32 bits, big-endian byte order); every 32 bits (4 bytes) of input becomes one number.
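
The decoding is easier to follow with fixed bytes instead of random ones. The byte strings below are made up purely for illustration.

```php
<?php
// Fixed bytes instead of random ones so the decoding is easy to follow.
$bytes = "\x00\x00\x01\x00";     // 4 bytes, most significant byte first
$values = unpack("N*", $bytes);  // 1-indexed array of unsigned 32-bit integers
echo $values[1], "\n";           // 0x00000100 = 256

// Eight bytes yield two 32-bit numbers, which implode() then concatenates.
$two = unpack("N*", "\x00\x00\x00\x01\x00\x00\x00\x02");
echo implode('', $two), "\n";    // prints "12"
```

Because the decimal numbers are not zero-padded, concatenating several of them can map different byte strings to the same digits (1 and 23 versus 12 and 3), so for identifiers longer than 4 bytes a hex encoding is less ambiguous.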

unpack() supports other representations, such as H, which returns hexadecimal: each 4 bits maps to one character (0-9 and a-f). It does exactly what bin2hex() does, for example:

unpack("H*", random_bytes(32))[1];
// Result: de2e461f0955d58dc4f6fbd55afafb779e976be781651d80832e91824fd22d35
// Result: 2688ffde028dc8893446912d057d388571e21dbcafe0c3185c5ec32d7f047f35
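
A quick check that the two hex encodings agree, with base64_encode() shown alongside for comparison. The 16-byte length is an arbitrary choice for this sketch.

```php
<?php
$bytes = random_bytes(16);

$viaUnpack = unpack("H*", $bytes)[1]; // hex string, high nibble first
$viaBin2hex = bin2hex($bytes);        // same encoding
$viaBase64 = base64_encode($bytes);   // shorter, but may contain +, / and =

var_dump($viaUnpack === $viaBin2hex); // bool(true)
echo strlen($viaUnpack), " hex chars, ", strlen($viaBase64), " base64 chars\n";
```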

Comparison, simulating 4 transactions per second (one transaction every 250000 microseconds):

time():
1493901169
1493901169
1493901169
1493901170

random_bytes(4) + unpack(N):
930833536
883463788
1732194618
265243744

random_bytes(32) + unpack(H):
a679d799284b4c2e3483c1f457a6d5cf9a500d8b37ecdbc0f804fc614f167ea3
a461dff841f8e43a4c377e507e04a47d6fbd8980a0f46dbb9bcaf2678ebadeac
68bdd0709e3dc2460dadd3dccf25771d98f6665159186270c2869dcaf0e79942
c82dca8125a82925872261bf5fca8ff026b9d0546a39658d8cbdd3a2c5267d33
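
A comparison like the one above could be produced with a script along these lines. This is a reconstruction under that assumption, not necessarily the script that generated the output shown; the array keys are arbitrary labels.

```php
<?php
// One "transaction" every 250000 microseconds, four in total.
$rows = ['time' => [], 'numeric' => [], 'hex' => []];
for ($i = 0; $i < 4; $i++) {
    $rows['time'][] = time();
    $rows['numeric'][] = implode('', unpack("N*", random_bytes(4)));
    $rows['hex'][] = unpack("H*", random_bytes(32))[1];
    if ($i < 3) {
        usleep(250000); // wait a quarter of a second before the next one
    }
}

// Print each group of identifiers under its label.
foreach ($rows as $label => $ids) {
    echo $label, ":\n", implode("\n", $ids), "\n\n";
}
```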

