How does a birthday attack work?

Asked

Viewed 2,030 times

34

I’ve heard of a technique called that, and something about exploring hash collisions.

But how does this technique work, and where it can be applied?

2 answers

40


The attack

This attack is based on birthday problem: If people are in a room, what are the chances of two of them having the same birthday? As there is a fixed number of days in the year, and the probability of being born on a given day is random and equally probable for each day, the chance of person A being born on the same day as person B is fixed: 1/365. If a C person enters, there is a 1/365 chance between him and A and 1/365 chance between her and B. As the number of people increases, the probability increases until reaching 100% when n is 366 (ignoring leap years to simplify).

However, given the way the number of combinations two by two grows rapidly as the number of people increases, this probability exceeds 50% long before the number of people reaches 183: only 23 people are enough. And after 70 people, the odds are over 99.9%.

Similarly, hash functions take as input a data of arbitrary size and output a data of fixed size. So if a very large number of different inputs are used, the probability of hash collision becomes greater and greater. That is, one can produce collisions simply by generating a sufficiently high number of distinct inputs, and by the anniversary problem that number is much smaller than the total of separate exits.

By way of example, if a hash function has a 128-bit output, a brute force attack would have to examine on average 2^127 inputs to produce a collision. A birthday attack, on the other hand, would do the same with just 2^64 attempts (source).

In practice

Practical applications, however, are quite limited. Unless you are projecting a cryptographic primitive - in which case one has to ask: 1) how many messages would need to be processed in order for the chance of collision to become significant? 2) How fast can you process these messages? 3) What can you do if you encounter a collision? (i.e. where it benefits you?)

However, cryptographic systems - which are usually composed of several distinct primitives - are often designed to be resistant to birthday attacks or, at least that the cost of a birthday attack is at least as high as its greatest possible vulnerability (and this higher than would be likely to be exploited in practice). So this is not a concern that the average developer needs to have, regardless of the scenario (e.g., password hash? irrelevant, be it? done correctly or not; checksum files? good hashes are not vulnerable, bad hashes are bad for other reasons; cipher? ditto; etc.).

An example

Not to leave without example (theoretical; as said, in practice this attack is not usually feasible), I will reproduce here the example of Wikipedia:

Let’s just say Mallory wants to trick Alice into digitally signing a fraudulent contract. It elaborates a very large number of legitimate contracts - varying only some details, such as punctuation, spacing, etc., but maintaining the semantics of the same constant - and a large number of fraudulent contracts as well. It produces everyone’s hash (the hash is usually the first step in a digital signature system - and the same can be done by Mallory without Alice’s private key knowledge), hoping to find a collision between an honest and fraudulent version.

In possession of this collision, she delivers the honest contract for Alice to sign. Alice signs with her private key. However, since what is signed is a hash of the contract and not the contract itself, by signing that hash she implicitly also signed the fraudulent contract (since both have the same hash). Mallory can then take the signature, attach it to the fraudulent contract, and present it to a third party - who will trust that Alice signed it once the digital signature is valid for that document.

4

"Birthday Attack" is used solely to measure how secure (or not) the cryptographic algorithm is by brute force techniques.


The "birthday attack" is nothing more than a brute force. The encryption algorithms in functions of hash are exposed to two vulnerabilities: brute force attack and cryptanalysis.

The brute force attack depends on the amount of the value of the hash bit. Whereas a crypto-analyse, on the contrary, relies on a specific flaw in the cryptographic algorithm.


Pre-image attacks:

A y obtained such that H(x) = y.

The "pre-image" resistance ensures that a hash is truly one-way, and it is not possible to generate the message using the hash. If the hash is not resistant to the pre-image an attacker will be able to read the message and/or obtain the secret code. For example imagine using an insecure hash to pre-image attacks on HMAC signatures. If someone you intercept will get used key and can change the message and sign using the discovery key.

This indicates that you have a hash h and a message m and knows that the hash used is H( s || m ). You wish to discover the s so that you could generate a new message and create a valid signature. For this you would need to reverse the hash function so that you can break the "one-way", yes MD5 is also vulnerable to this.

Second image attack pre-image:

A y and x such that H(x) = H(y) however x != y.

The "second pre-image" is to generate a new message, different, so that two messages have the same hash.

A person wants to find a value x such that H(x) is equal to a hash h. For example, imagine you have a text a and the H(a) results in h, you want another text, b such that H(b) also be equal h, thereby creating a collision.

To do this you will generate a value of b and experience each generated value until H(b) = H(a). The number of attempts required to find a collision is proportional 2**m, the m is the number of bits, so you should generate until 2**(m-1) values of b to find one that matches hash resulting from H(a).

Attack of the birthday:

If the effort required would be 2**(m-1), but using the mathematical result of the anniversary paradox the effort will be less. The anniversary paradox was explained by @mgibsonbr, basically this makes it possible to 2**(m-1) reduced to 2**(m/2), I’ll explain it differently.

If you need 23 people to have a 50% chance of finding a person among themselves with the same birthday. So, how many people it takes for me to have a 50% chance of finding someone who has the same birthday that I? 183 (365 / 2).

That’s the same with the hash, how many hash I need to generate (with a 50% chance) to find a hash that is the same as mine? x/2, in case: 2**(x/2), where the x is the number of bits.

If a 64-bit hash is used, only 2**32 attempts. In 1994, Van Oorschot proved it possible to break the MD5, creating collisions in just 24 days for $10 million:

As a particular example, a $10 Million custom machine for Applying Parallel Collision search to the MD5 hash Function could complete an Attack with an expected run time of 24 days.

Remember that the size of the hash of the MD5 is 128 bits and that was proposed there in 1994, in which case this same machine would take over four thousand years to crack a 160-bit encryption, like the SHA1. Nowadays a 160 bit hash, like the SHA1 it’s no longer safe, but it was then.


Examples:

The same example of @mgibsonbr can be applied to something more serious than SSL, which is based on signatures! in 2008 a group managed to falsify SSL using the power of literally two hundred Playstation 3 (no mistake, is literally 200 x Sony Playstation 3). This was possible by creating SSL certificate collisions using the weak MD5 algorithm.

In 2009 there were still about 14% of websites using vulnerable certificates, perhaps still exist today, but browsers no longer rely on MD5 certificates. At the same time the SHA-1 was widely used, although it was no longer considered safe since 2005. The reason for the failure of the SHA-1 is by cryptanalysis, in theory the brute force would have (by calculating the "Anniversary Attack) 2**80 operations, but by failure in encryption this number falls to 2**69.

In 2014 Google Chrome no longer trusted certificates SHA-1, since it was no longer considered safe since 2005. This showed a triangle image next to the lock in the corner of the browser and in November of the same year the lock was removed, thus not shown to the user that the website as "Safe" (but rather as a normal HTTP) and in 2015 is shown as "Unsafe".

Microsoft will only allow the use of SHA-1 certificates until 2016:

However, SHA-1 Code Signing Certificates should be used for Targeting At These earlier Platforms only due to the Fact that SHA-1 is no longer considered to be Secure and is susceptible to Attacks. Windows 7 and later Platforms will stop Accepting SHA-1 Code Signing Certificates after January 1 st , 2016. Software Developers may need to use Both SHA-1 and SHA-2 Platforms.

Source

This indicates that nowadays Windows only accepts signatures of the "family" of SHA-2 (understand as SHA-224, SHA-256, SHA-384, SHA-512), the reason is simple: it has become easy to collide signatures.

In a matter of a few years, as the advancement of technology will cheapen the cost of such brute forces the other cryptographic algorithms, such as the SHA-256 will be considered totally unsafe, it is only a matter of time.

  • Good answer, plus one. As for this final statement of yours - that it is only a matter of time before SHA-256 becomes unsafe - I agree that this can happen because of cryptographic breaking, but I disagree about brute force. In the absence of quantum computers, the SHA-256 can be considered "safe for all time", and in the presence of them, the SHA-512 must be sufficient (again, against brute force, not against Criptanalysis). It’s a matter of thermodynamics, not just technology. See that post on security.SE for more details.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.