How to hash passwords safely?

Viewed 31,327 times

384

If I hash passwords before storing them in my database, is that enough to prevent someone from recovering them?

I’m only talking about recovery directly from the database, not any other type of attack, such as brute force against the application’s login page, a keylogger on the client, or rubber-hose cryptanalysis. No form of hashing will stop those attacks.

My concern is to make it difficult, or even impossible, to obtain the original passwords if the database is compromised. How can I provide greater security in this regard?

What additional precautions would prevent access to the passwords? Are there better ways to do this hashing?

  • 3

    If possible, use Argon2: it was the winner of the Password Hashing Competition (PHC) and has more tuning options than PBKDF2 and bcrypt.

  • Passwords are usually short, which makes brute-force attacks easier. A hash function has a fixed-size output, which makes it possible to generate all possible passwords and look for matching hashes. The common practice is to lengthen the password (with some padding) before calculating the hash, so that a brute-force attack has many more possibilities to cover. This padding is stored in another database (physically and logically separated).

  • Once I did something like this: the user enters the password into the system, the system calculates the hash, then applies an encryption based on a Feistel cipher with many rounds (enough to take 2~3 seconds) and stores the encrypted hash in the database. A brute-force attack would take 2~3 seconds per attempt, even if the attacker knew the cipher. The drawback is that password validation also takes 2~3 seconds, but what are 2~3 seconds in exchange for more security?

9 answers

405


Theory

Hashing passwords is always a secondary defense. A server that authenticates users needs some information in order to validate a password. A simple system stores the passwords themselves, and validation is then a mere string comparison. In that case, anyone who merely peeks into the database file sees far too much. This kind of breach happens in practice: a backup left in the wrong place, a hard drive replaced and not wiped properly, an SQL injection, and so on. See a detailed discussion on this blog.

Even so, since the contents of a server that validates passwords necessarily include the data needed for that validation, someone who has a mere copy of them can run an offline dictionary attack, trying potential passwords until one matches. This kind of attack is unavoidable. What we can do is make it as difficult as possible, and for that we have these tools:

  • Cryptographic hash functions: mathematical functions that are efficient to compute but that no one knows how to reverse. The server can keep a hash of the password; to validate a submitted password, it hashes that value the same way and checks whether the two hashes match. Looking at the hash alone does not reveal the original password.

  • Salts: one of the attacker’s advantages is parallelism. The attacker usually grabs a whole list of hashed passwords and wants to recover as many of them as possible. He can hash one potential password and compare that single hash with hundreds of different records. He can also use precomputed hash tables, including rainbow tables.

    The hallmark of parallel attacks is that they work on multiple passwords processed with the very same hash function. Salting means using not one hash function but a large family of them; ideally, each password should be hashed with its own function. A salt is a way to select a specific hash function from that large family. Properly used, it completely defeats parallelism.

  • Slowness: computers keep getting faster, as predicted by Gordon Moore, co-founder of Intel. Human brains do not. Each year attackers can test more and more passwords in the same amount of time, while users do not remember more complex passwords (or refuse to). To counter this, we can make hashing extremely slow by using functions that require many iterations.
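The salt and slowness ideas above can be sketched in a few lines of Python with the standard library's `hashlib.pbkdf2_hmac`. This is only an illustration: the function name `slow_salted_hash` and the iteration count are assumptions made for this example, not values taken from the answer.

```python
import hashlib
import secrets

def slow_salted_hash(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Salted, iterated hash: the salt selects the function, the iterations slow it down."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

# Two users who happen to choose the same password get unrelated hashes,
# so one guess can no longer be checked against many records at once.
salt_alice = secrets.token_bytes(16)
salt_bob = secrets.token_bytes(16)
hash_alice = slow_salted_hash("Senhasecreta37", salt_alice)
hash_bob = slow_salted_hash("Senhasecreta37", salt_bob)
```

With the same salt the function is deterministic, which is what lets the server re-check a password later; with different salts the outputs differ even for identical passwords.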

We have some very common cryptographic hash functions, such as MD5 and the SHA family. Building a hash function out of elementary operations is not an easy task. When cryptographers want to do it, they think hard and then organize tournaments where the functions "fight each other fiercely". After hundreds of cryptographers have spent several years twisting, stirring and poking at a function and found nothing bad to say about it, they begin to admit that, perhaps, that function can be considered more or less secure. That’s what happened in the SHA-3 competition. We have to build these functions this way because we don’t know any better. Mathematically, we don’t know whether secure hash functions really exist; what we have are candidates (this is the difference between "can’t be broken" and "no one knows how to break").

A basic hash function, even one that is secure as a hash function, is not suitable for passwords, for the following reasons:

  • It has no salt, which allows parallel attacks and precomputed tables.
  • It is far too fast, and it keeps getting faster as hardware improves.

So we need something better. What’s more, properly combining a hash function, a salt and iteration is no easier than designing a hash function, at least if you want a secure result. Here again you have to rely on standard constructions that have survived the continuous onslaught of vindictive cryptographers.

Good hash functions for passwords

PBKDF2

PBKDF2 comes from PKCS#5. It is parameterized with an iteration count (an integer of at least 1, with no upper limit), a salt (an arbitrary sequence of bytes with no size restriction), the required output size (PBKDF2 can generate output of configurable size) and a PRF. In practice PBKDF2 is always used with HMAC, which is itself built on an underlying hash function. So when we say "PBKDF2 with SHA-1", it actually means "PBKDF2 with HMAC with SHA-1".

Advantages:

  • It has been specified for a long time, and it is still "unscathed".
  • Already implemented in several frameworks (such as .NET).
  • Highly configurable (although some implementations do not let you choose the hash function, such as .NET, which uses SHA-1 only).
  • Approved by NIST (mind the difference between hashing and key derivation; read on below).
  • Configurable output size (read on below).

Disadvantages:

  • It is CPU-intensive only, and therefore lends itself well to GPU optimization (the server is usually a normal PC doing ordinary tasks, while the attacker can invest in specialized hardware and gain an edge from it).
  • You have to manage the parameters yourself (generating and storing the salt, recording the number of iterations, etc.). There is a standard encoding for PBKDF2 parameters, but it uses ASN.1, so people avoid it whenever they can (ASN.1 is complicated for non-experts).
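Since PBKDF2 leaves parameter handling to you, one possible approach is to pack everything into a single string. The sketch below uses Python's standard `hashlib.pbkdf2_hmac`; the `algorithm$iterations$salt$hash` layout, the `pbkdf2_sha256` label, the function names and the iteration count are all assumptions made for this illustration, not a standard encoding.

```python
import base64
import hashlib
import hmac
import secrets

ALGORITHM = "pbkdf2_sha256"  # label for this ad hoc storage format (an assumption)
ITERATIONS = 200_000         # illustrative; tune to what the server tolerates

def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return 'algorithm$iterations$salt$hash' with a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return "$".join([
        ALGORITHM,
        str(iterations),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])

def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and iteration count, then compare."""
    algorithm, iterations, salt_b64, digest_b64 = stored.split("$")
    if algorithm != ALGORITHM:
        return False
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, int(iterations), dklen=len(expected)
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)
```

Keeping the iteration count inside the stored string means it can be raised later for new hashes without invalidating the old ones.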

bcrypt

bcrypt was designed by reusing and expanding elements of a block cipher called Blowfish. The iteration count is a power of two, which is somewhat less configurable than in PBKDF2, but sufficient for normal use. It is the core password hashing mechanism of OpenBSD.

Advantages:

  • Implementations are available for many languages (see the links at the bottom of the Wikipedia page).
  • More "resistant" to GPUs, thanks to details of its internal design. The bcrypt authors did this on purpose: they reused Blowfish because it relies on an internal RAM table that is constantly accessed and modified during processing. This complicates life for anyone who wants to accelerate bcrypt with GPUs, because GPUs are not good at making many memory accesses in parallel.
  • The standard output encoding includes the salt and the iteration count, which greatly simplifies storage: it is a single string.

Disadvantages:

  • Fixed output size: 192 bits.
  • Although it thwarts GPUs, it can still be optimized with FPGAs: modern FPGA chips have built-in RAM blocks that are convenient for running many bcrypt instances in parallel on the same chip. This has already been done.
  • The input password is limited in length. The figure usually quoted here is 51 characters, although common implementations accept up to 72 bytes. For longer passwords, you would have to combine bcrypt with a hash function (hash the password, then run bcrypt on the result). Combining cryptographic primitives carries risks, so this is not recommended for general use.
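To make the "single string" point concrete, here is a small Python sketch that only splits a bcrypt string into its parts; it does not compute bcrypt. It assumes the usual `$version$cost$` prefix followed by 22 salt characters and 31 hash characters. The names `BcryptParts` and `parse_bcrypt` are made up for this example, and the sample string is just an illustrative value in that layout.

```python
from typing import NamedTuple

class BcryptParts(NamedTuple):
    version: str  # e.g. "2a", "2b", "2y"
    cost: int     # log2 of the iteration count
    salt: str     # 22 characters encoding the 128-bit salt
    digest: str   # 31 characters encoding the hash value

def parse_bcrypt(encoded: str) -> BcryptParts:
    """Split a '$version$cost$salt+hash' string into its four components."""
    fields = encoded.split("$")
    if len(fields) != 4 or fields[0] != "" or len(fields[3]) != 53:
        raise ValueError("not a bcrypt string in the usual layout")
    _, version, cost, rest = fields
    return BcryptParts(version, int(cost), rest[:22], rest[22:])

sample = "$2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy"
parts = parse_bcrypt(sample)
iterations = 2 ** parts.cost  # the cost is an exponent, so 10 means 1024
```

Because the salt and cost travel with the hash, the verifying code needs nothing but this one column from the database.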

scrypt

scrypt is much newer (designed in 2009). It builds on PBKDF2 and a stream cipher called Salsa20/8, but these are just tools around the core strength of scrypt, which is RAM. scrypt was designed to use a lot of RAM: it generates a batch of pseudo-random bytes and then reads them repeatedly, in an order that is itself pseudo-random. "Lots of RAM" is something that is hard to parallelize. A basic PC is good at accessing RAM and will not try to read dozens of unrelated bytes from RAM simultaneously. An attacker with a GPU or an FPGA will want to do exactly that, and will find it difficult.

Advantages:

  • A PC, which is what the defender will normally use to compute the hash, is one of the best platforms (if not the best) for running scrypt. The attacker no longer gains an advantage by investing in GPUs or FPGAs.
  • One more tuning option: memory usage.

Disadvantages:

  • It is still new (my "personal rule" is to wait for at least 5 years of exposure; but, of course, it is good that other people use scrypt in production, to give it more visibility).
  • Not as many implementations are available for the various languages.
  • It is unclear what the ideal mix of CPU and RAM is. For each pseudo-random access, scrypt still has to compute a hash; a cache miss costs about 200 cycles, while one SHA-256 invocation costs about 1000. There is room for improvement here.
  • One more option to adjust: how much memory to use.
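As an illustration of the extra memory parameter, Python's standard library exposes scrypt as `hashlib.scrypt` (available when Python is built against OpenSSL 1.1 or later). The wrapper name `scrypt_hash` and the parameter values below are assumptions chosen for this sketch, not recommendations from the answer.

```python
import hashlib
import hmac
import secrets

def scrypt_hash(password: str, salt: bytes, n: int = 2**14, r: int = 8, p: int = 1) -> bytes:
    """Derive a 32-byte value; n and r together set the memory cost."""
    # Memory use is roughly 128 * n * r bytes: about 16 MiB with these values.
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, n=n, r=r, p=p, dklen=32)

salt = secrets.token_bytes(16)
stored = scrypt_hash("Senhasecreta37", salt)

# Verification recomputes with the same salt and parameters.
matches = hmac.compare_digest(stored, scrypt_hash("Senhasecreta37", salt))
```

Raising `n` increases both time and memory, which is the knob that PBKDF2 and bcrypt do not have.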

OpenPGP Iterated and Salted S2K

I mention this because you will use it if you protect files with a password using GnuPG. This tool follows the OpenPGP format, which defines its own password hashing functions, called "Simple S2K", "Salted S2K" and "Iterated and Salted S2K". Only the third can be considered "good" in the context of this answer. It is defined as the hash of a very long string (configurable, up to about 65 megabytes) consisting of the repetition of an 8-byte salt and the password.

Broadly speaking, OpenPGP’s "Iterated and Salted S2K" is decent; it is similar to PBKDF2 with fewer options. You will rarely find it outside of OpenPGP.
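The "hash of a very long repeated string" idea can be sketched in Python as follows. This is a simplified illustration, assuming the hash output is at least as long as the key you need; the function name, the choice of SHA-256 and the byte count are assumptions for this example, and real OpenPGP software also handles the encoding of the count, which is omitted here.

```python
import hashlib

def iterated_salted_s2k(password: bytes, salt: bytes, count: int,
                        hash_name: str = "sha256") -> bytes:
    """Hash `count` bytes of the endlessly repeated string salt + password."""
    unit = salt + password
    count = max(count, len(unit))    # at least one full copy is always hashed
    hasher = hashlib.new(hash_name)
    full_copies, remainder = divmod(count, len(unit))
    for _ in range(full_copies):
        hasher.update(unit)
    hasher.update(unit[:remainder])  # partial copy to reach exactly `count` bytes
    return hasher.digest()

salt = bytes(8)  # placeholder 8-byte salt; use random bytes in practice
key = iterated_salted_s2k(b"Senhasecreta37", salt, count=1_000_000)
```

The slowness here comes purely from the amount of data fed to the hash, which is why the byte count plays the role that the iteration count plays in PBKDF2.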

"crypt" of Unix

Current Unix-like systems (such as Linux) use salted, iterated variants of the crypt() function, based on good hash functions with thousands of iterations. That is fairly good. Some systems can also use bcrypt, which is even better.

The "old" crypt() function, which was based on the DES block cipher, is not good:

  • It is slow in software but fast in hardware, and it can also be sped up in software when computing multiple instances in parallel (a technique known as SWAR or "bitslicing"), which favors the attacker.
  • It is still very fast, with only 25 iterations.
  • Its salt is only 12 bits, which implies frequent reuse.
  • It truncates the password to 8 characters and also drops the high bit of each character, reducing it to 7-bit ASCII.

The newer variants, which are the ones currently in use, are OK.

Bad hash functions for passwords

Everything else, especially the homemade solutions that people insist on creating, assuming that "secure cryptography" means "throwing together every cryptographic or non-cryptographic operation that can be thought of". See this question for an example. The underlying principle seems to be that the sheer complexity of the resulting tangle of instructions will confuse attackers. In practice, the developer will always be more confused by his own creation than the attacker.

Complexity is bad. Homemade solutions are bad. Novelty is bad. Remembering this will help you avoid 99% of the problems with password hashing, and with security in general.

Password hashing in Windows used to be mind-bogglingly awful; now it is just bad (MD4 with no salt and no iterations).

Key derivation

So far we have considered the question of hashing passwords. A closely related problem is transforming a password into a symmetric key that can be used for encryption. This is called key derivation, and it is the first thing you do when you encrypt a file with a password.

It is possible to devise contrived examples of password hashing functions that are secure for storing a validation token but terrible for generating symmetric keys, and the reverse is equally possible. These examples are, however, artificial. For the practical functions described above:

  • The output of a password hashing function is acceptable as a symmetric key, after truncation to the required size.
  • A key derivation function can serve as a password hashing function, provided that the derived key is long enough to avoid "generic preimages" (the attacker is simply lucky and finds a password that gives the same result). An output of 100 bits or more should be sufficient.

In fact, PBKDF2 and scrypt are key derivation functions, not password hashing functions, and NIST approves PBKDF2 as a key derivation function, not explicitly as a password hasher (but, with just a little bit of hypocrisy, you can read the NIST material in a way that says PBKDF2 is good for hashing passwords).

Conversely, bcrypt is really a block cipher (the bulk of the password processing is the "key schedule") which is then used in CTR mode to produce three blocks (192 bits in total) of pseudo-random output, making it a kind of hash function. bcrypt could be turned into a key derivation function with a little surgery, by using the block cipher in CTR mode to generate more blocks. But, as usual, homemade patches are not recommended. Fortunately, 192 bits are more than enough for most purposes (e.g. symmetric encryption with GCM or EAX only needs a 128-bit key).
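A minimal Python sketch of key derivation with PBKDF2, using the standard `hashlib.pbkdf2_hmac`. The function name `derive_key`, the iteration count and the 16-byte key size are assumptions for this example; the encryption step itself is left out.

```python
import hashlib
import secrets

def derive_key(password: str, salt: bytes, key_bytes: int = 16,
               iterations: int = 200_000) -> bytes:
    """Turn a password into a symmetric key of the requested size."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=key_bytes
    )

salt = secrets.token_bytes(16)            # stored next to the encrypted file; not secret
key = derive_key("Senhasecreta37", salt)  # 128-bit key, e.g. for GCM or EAX
```

Asking PBKDF2 for a shorter output is the same as truncating a longer one, which matches the first bullet above: here the 16-byte key is simply the start of the 32-byte output.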

Additional considerations

How many iterations?

As many as possible! This salted-and-slow hashing is an arms race between attacker and defender. You use many iterations to make hashing a password harder for everyone. To improve security, you should set the number as high as your server can tolerate, given the other tasks it must also perform. The higher the better.
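One way to find "as high as the server can tolerate" is to measure it on the server itself. The Python sketch below doubles a PBKDF2 iteration count until a single hash takes at least a chosen time; the function name, the 0.25-second target and the starting count are assumptions for this example.

```python
import hashlib
import time

def calibrate_iterations(target_seconds: float = 0.25, start: int = 10_000) -> int:
    """Double the PBKDF2 iteration count until one hash takes at least target_seconds."""
    iterations = start
    while True:
        began = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark password", b"benchmark salt..", iterations)
        if time.perf_counter() - began >= target_seconds:
            return iterations
        iterations *= 2
```

Running this on the production hardware, under realistic load, gives a starting point; the result should be revisited as hardware gets faster.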

Collisions and MD5

MD5 is broken: it is very easy to find many pairs of distinct inputs that hash to the same output value. These are called collisions. However, collisions are not a problem for password hashing. Password hashing needs the function to resist preimages, not collisions. Collisions are pairs of messages that give the same output with no restriction on what that output is, whereas in password hashing the attacker has to find a message that yields a given output, one he did not get to choose. This is quite different. As far as we know, MD5 is still (almost) as strong as ever with regard to preimages (there is a theoretical attack, which is still far from being feasible in practice).

The real problem with MD5, as it is commonly used for password hashing, is that it is very fast and unsalted. PBKDF2 used with MD5, however, would be robust. You should still use SHA-1 or SHA-256 with PBKDF2, but for "public relations" reasons: people get nervous when they hear "MD5".

Salt generation

The fundamental purpose of a salt is to be as unique as possible. Whenever a salt value is reused anywhere, it can potentially help an attacker. For example, if you use the user name as the salt, an attacker can build rainbow tables for "admin" and/or "root", since those tables would work against the many sites that have users with these names.

Similarly, when a user changes his password, the name stays the same, leading to salt reuse. Old passwords are valuable targets, because users tend to reuse them in many places (this is always said to be a bad idea and everyone knows it, but people keep doing it because it makes life easier). In addition, people tend to generate passwords in sequence: if Alaor’s old password is "Senhasecreta37", the new one may well be "Senhasecreta38" or "Senhasecreta39".

The "cheap" way to get unique salts is to use randomness. If you generate your salts as random bytes from the cryptographically secure generator that your operating system offers (/dev/urandom, CryptGenRandom()...), you will get salt values that are "unique enough". With 16 bytes, for example, you will never see a salt collision in your lifetime.

UUIDs are a standard way of getting "unique" values. Note that "version 4" UUIDs simply use randomness (122 bits), as described above. Many frameworks offer simple functions to generate UUIDs on demand, and they can be used as salts.
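Both ways of generating a salt described above fit in a few lines of Python. The helper names are assumptions for this example; `secrets.token_bytes` draws from the operating system's secure generator, and `uuid.uuid4` produces a random version 4 UUID.

```python
import secrets
import uuid

def new_salt(size: int = 16) -> bytes:
    """Random salt from the operating system's secure generator."""
    return secrets.token_bytes(size)

def new_salt_from_uuid() -> bytes:
    """Random (version 4) UUID used as a 16-byte salt."""
    return uuid.uuid4().bytes

# A fresh salt is generated every time a password is set or changed,
# so a user's old and new passwords never share a salt.
salt_at_signup = new_salt()
salt_after_password_change = new_salt()
```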

Does the salt have to be secret?

Salts are not meant to be secret; otherwise they would be called keys. You don’t need to publish the salt, but if you have to (for client-side hashing, for example), don’t worry about it. The salt exists only to be unique. It is nothing more than a way of selecting one hash function among many.

Pepper

Cryptographers can never leave a metaphor alone; they have to extend it with further analogies and puns (after "salt" comes "pepper"). If you use a pepper in your hash function, you are using a different kind of cryptographic algorithm: you are computing a message authentication code (MAC) over the password. The MAC key is your pepper.

Peppering makes sense if you have a secret key that the attacker cannot read. Remember that we hash passwords because we assume an attacker can get a copy of the server’s database, or even of the entire disk. A common case is a server with two disks in RAID 1. One disk fails because its controller board burns out. It happens all the time. The sysadmin replaces the disk, the mirror is rebuilt, and nothing is lost thanks to the magic of RAID 1. But the old disk no longer works, so the sysadmin has no easy way to wipe its contents and simply throws it away. The attacker finds the disk in the trash, replaces the board, and lo and behold! He has a complete copy of the system, with database, configuration, executables, OS...

For the pepper to work in such a case, you need something more than a PC with disks; you need a hardware security module (HSM). HSMs are expensive, both economically and operationally, but with an HSM you can just use the pepper and process passwords with a simple HMAC (for example with SHA-1 or SHA-256). This will be much more efficient than bcrypt/PBKDF2/scrypt and their cumbersome iterations. Besides, using an HSM makes you look very professional in a WebTrust audit.
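The "password MAC" idea looks like this in Python, using the standard `hmac` module. In the scenario described above the key would live inside the HSM and the HMAC would be computed there; in this sketch an in-memory `pepper` value merely stands in for it, and the function names are assumptions for the example.

```python
import hashlib
import hmac

def mac_password(pepper: bytes, password: str) -> bytes:
    """HMAC-SHA-256 of the password, keyed with the secret pepper."""
    return hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()

def check_password(pepper: bytes, password: str, stored: bytes) -> bool:
    """Recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(mac_password(pepper, password), stored)

# Stand-in for a key the attacker cannot read (hypothetical value).
pepper = bytes(range(32))
stored = mac_password(pepper, "Senhasecreta37")
```

Without the pepper, a stolen copy of `stored` cannot even be tested against guesses, which is why this only helps if the key truly stays out of the attacker's reach.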

Client-side hashing

Since hashing is deliberately "expensive", it could make sense in a client-server architecture to use the client’s CPU. After all, when 100 clients connect to one server, together they have much more processing power than the server. To do this, the server needs to send the salt to the client. This implies one more round trip, which may or may not be easy to add in a given case. In a web context it is more complicated, because JavaScript has always been rather "anemic" at CPU-intensive work. In the context of Secure Remote Password (SRP), the hashing has to happen on the client side anyway.

Conclusion

Use bcrypt (though some are against it). PBKDF2 isn’t bad either. If you use scrypt, you will be a "slightly early adopter", with the risks implied by that expression; but it is good for the progress of science. Being a "crash dummy" is an honorable profession!

This is an adaptation of the excellent answer given by Thomas Pornin on the Information Security site, which is also part of the Stack Exchange network.

  • 18

    As for slowing down the hash, I recommend a Sleep of random duration in the thread, because of timing attacks.

  • 14

    @Miguelangelo The purpose of the hash function is to resist offline attacks; an attacker who got a copy of your database will not also run the sleep "as a matter of honor", he will remove it so he can test passwords as fast as possible... It is best to keep the work factor as high as possible, otherwise you slow yourself down and your opponent doesn’t. P.S. I agree that timing attacks are a serious matter, but I don’t know whether they apply to password hashing. That question on security.SE discusses the subject in more depth.

  • 5

    @mgibsonbr: indeed, the Sleep should be done regardless of whether the hash is calculated or not. Ideally, all requests to a login page should take the same time, regardless of whether the credentials are valid, of the hash and of any other factors.

  • The link in the conclusion, "some are against", is broken. Do you happen to know what it said, so that we can look for an alternative source? By the way, this link was not in the original answer on security.SE, and although I don’t know its content, I do see arguments against it.

  • 2

    @mgibsonbr I made other tweaks along the way too; it is an adaptation, not a 100% translation. The link you posted also refers to the same article that "died"; I think I will use it as a substitute for now, because it mentions some of the points. First I’ll check the web archive to see whether the original is there. UPDATE: updated with the Wayback Machine archive.

  • Disadvantages: Fixed output size: 192 bits. >>> I could have sworn every hash had this property! +1

  • 5

    Congratulations on the "Great Answer" gold badge for 100 upvotes, the first on the site. :)

  • 5

    Thanks, @Victorstafusa, but here we have to acknowledge that the really significant work was done by the original author ;)

  • 1

    If I were to sum up PBKDF2, what exactly would it look like? When creating a hash in the style <algorithm>$<iterations>$<salt>$<hash>, would it be: pbkdf2$100000$16$kXNppmR0lbEoxRYXBtozKLv6KnAGQ== ?

  • 1

    The definition and requirements of PBKDF2 are briefly described in https://en.wikipedia.org/wiki/PBKDF2. The PBKDF2 implementation in PHP 5.5, for example, gives the key derivation in this 160 bit format: 120fb6cffcf8b32c43e7. How to store the other parameters in a string, as in your example, depends a lot on how the values will be used.

  • @Bacco please update the answer to include Argon2 and the information that scrypt is vulnerable to cache-timing attacks.

  • The article criticizing bcrypt: http://www.unlimitednovelty.com/2012/03/dont-use-bcrypt.html?m=1

  • @Inkeliz I’ll find a moment one of these days to give it a general overhaul; I need to set aside some time for this.

  • Like "PBKDF2 with SHA-1", actually means "PBKDF2 with HMAC with SHA-1".

103

If I hash passwords before storing them in my database, is that enough to prevent someone from recovering them?

The purpose of hashing is to make it hard for an attacker who has already gained (read) access to your database to discover the original passwords. To gain access online (i.e. to log in as a user), it is not enough to present the hash as a credential: one must present something that, once hashed, is identical to the stored hash. A good hash algorithm should therefore be resistant to preimage attacks (i.e. given a hash X, finding a string Y such that hash(Y) = X).

Other desirable features of a password hash are slowness (it increases the time spent on an exhaustive search, which matters if the entropy of the passwords is low, as is common with passwords chosen by the users themselves) and uniqueness (even if two users have the same password, their hashes are distinct, which prevents a single attack from discovering the passwords of multiple users). For good algorithms that meet these requirements, see @Bacco’s answer.

I’m only talking about recovery directly from the database, not any other type of attack, such as brute force against the application’s login page (...) No form of hashing will stop those attacks.

In fact, a well-made hash makes this specific scenario at least as safe as an offline attack, because it puts a lower bound on the time needed to test each password candidate. There are better ways to protect against online attacks (limiting the number of login attempts in a given period, increasing the delay and/or locking the user out after N failed attempts, etc.), but even in their absence the hash should still protect against brute-force attacks.

My concern is to make it difficult, or even impossible, to obtain the original passwords if the database is compromised.

As already mentioned, the big problem in protecting a password is its low entropy: there are sophisticated algorithms for generating good "password candidates", able to "guess" the vast majority of passwords chosen by ordinary users without wasting time on unlikely ones (such as pqzrwj). When combined with personal data about the user in question, whether obtained from public records or from the user’s own data (the latter is outside the scope of the question, since it is a scenario similar to that of a keylogger), the discovery rate becomes even higher.

For this reason, if an access credential is derived from a password, nothing you do will make it impossible for someone to test password candidates until they find the real one. No matter how complicated your derivation process is (at first I thought of suggesting SRP or HOTP/TOTP to help protect the original password, but I realized it would be useless), once you assume the attacker has access to all its parameters (algorithm, salt, pepper, etc.) and to the expected result (hash, key, etc.), he can recover the password by brute force.

Thus, making it impossible to obtain the original password is not feasible; all that remains is to make it harder (which is done with a slow algorithm, not necessarily a hash). To make the scheme foolproof, the only option is to prevent the attacker from gaining access to the confidential information in the first place...

  • Additional detail: I use "passwords" in the sense of "text to be memorized by the user". If a system uses randomly generated, sufficiently long passwords (256 bits of entropy, or 512 in the presence of quantum computing), then even the simplest and fastest hash, such as MD5, is enough to prevent an attacker in practice from discovering a previously unknown password (MD5 is vulnerable to collisions, but that is not relevant here; its preimage resistance is what matters, and it has not yet been seriously challenged).

    The reason is that, although in theory one can still test candidates to find the right one, the number of candidates is so large that this cannot be done in practice: even diverting all the resources of the planet (and beyond) to this task, the probability of finding the right candidate would remain negligible.

    However, this kind of password is usually too complex to memorize, so the only practical way to use it is to write it down somewhere, keep a copy in a file, or use a password manager. In this context, since the recorded data needs additional protection, this credential is more commonly called a "key", not a password, and has very different security characteristics from what we call a password in everyday use.
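To put rough numbers on "cannot be done in practice", here is a small Python calculation. The guessing rate of 10^12 per second and the 40-bit figure for a memorized password are hypothetical values chosen for this illustration, not measurements.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def years_to_search(entropy_bits: int, guesses_per_second: float) -> float:
    """Average time to find a secret: half of the search space must be tried."""
    return (2 ** (entropy_bits - 1)) / guesses_per_second / SECONDS_PER_YEAR

# Hypothetical attacker testing a trillion candidates per second.
rate = 1e12
random_key_years = years_to_search(256, rate)  # random 256-bit credential
memorized_years = years_to_search(40, rate)    # assumed 40 bits for a human-chosen password
```

Under these assumptions the 256-bit case comes out above 10^50 years, while the 40-bit case takes less than a second, which is why a fast hash is acceptable for random keys but not for memorized passwords.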

What additional precautions would prevent access to the passwords? Are there better ways to do this hashing?

The "slowness" of the hash is up to you: if it is acceptable to wait 1 minute before logging in, you can choose a hash that takes 1 minute to compute on top-of-the-line hardware. A dictionary/brute-force attack at 1 attempt per minute would hardly get far... In the end, it is a balance between confidentiality and availability.

As for further measures, "defense in depth" is important (e.g. use a hash algorithm that combines a) a value from the database; b) a value from the file system; c) a value present only in memory, entered by the operator when the server starts/restarts), as is hardening your application/server against attacks that would give access to this data. Just remember that the higher the security requirement, the higher the cost, so weigh the risks and consequences before "making every possible effort" or aiming for the "most secure possible" system, because that is rarely the most sensible approach (e.g. item "c" would require manual intervention whenever the server is restarted).

78

phpass - the secret of WordPress

  • "How to hash passwords securely?"

When I had to answer that question in practice, a few years ago, I went to peek at the WordPress code. My reasoning was: if the whole world uses WordPress massively, and we no longer hear of attacks due to security vulnerabilities in it... then its solution must be "bulletproof"...

So I dove into the WordPress code. I confess I was horrified by what looked like a mess to me. But I was also surprised, even astonished, to find that WordPress does not use sessions. That was a great lesson, and I really admired it. Anyway, I kept untangling things until I had extracted the code that deals with authentication/authorization, including hash generation. I put it all together in a class, used it in some projects, and recently turned it into a package for Laravel.

For anyone interested, the code is on GitHub.

However, the heavy cryptographic lifting is done by the "Portable PHP password hashing framework" (phpass).

Knowing only what is necessary and sufficient about salts and keys, and without knowing the deeper details of cryptography, I know I’m offering my clients a "top-notch" solution, as secure as the one actually used by the heavyweight WordPress! :-)

Practical Examples

Using phpass to hash a password is easy:

$hasher = new PasswordHash();
$hash = $hasher->HashPassword( $senha );

Remember that this is as secure as the hash generation in WordPress.

Another nice thing is that every time the function is run for the same password, a different hash is generated! Because of this, you cannot check whether a password is correct by comparing hashes like this:

if ($hash_armazenado_no_bd == $hasher->HashPassword( $senha ))

Even so, checking whether a password is correct is also very easy:

$hasher = new PasswordHash();
$correta = $hasher->CheckPassword( $senha, $hash );

In the code above, the variable $hash must contain the hash of the user’s password, previously saved in the database and retrieved from it.

Thus, thanks to open source, we satisfy a demanding security requirement with the least possible effort, making the system as secure as possible at the lowest possible cost! ;-)

To learn more: How to Manage a PHP application’s users and passwords

  • 3

    I agree that phpass is a decent library, but knowing some details of how it works is still important to build a "system as safe as possible". For example, if this package is used in a PHP before 5.3.0 (and without the Suhosin patch) all the security guarantees go down (because in the absence of bcrypt it falls on iterated MD5). The algorithm work factor (as important as "salt and keys") is also configurable by this package ($hash_cost_log2), and if misused also makes it insecure.

  • 4

    By the way, the reason the function returns a different hash every time for the same password is that the bcrypt output includes the (randomly generated) salt and the work factor. The code if ($hash_armazenado_no_bd == $hasher->HashPassword( $senha )) does not work because it regenerates these parameters, but what CheckPassword does behind the scenes is basically the same thing, only implemented in the correct way for the algorithm used.

  • 3

    But will it be safe enough for us to rest easy?

  • 3

    Hello! I only saw this now and it helped me A LOT! But I still have a question... I use CheckPassword like this: include_once('DataAccess.php'); $db = new DataAccess(); $stored_hash = $db->getPassword(); $correta = $hash->CheckPassword( $user_pass, $stored_hash ); if ($correta == true){ ... } But it keeps saying the password is wrong... Any idea?

52

In short:

Whoever gets hold of the hash can, sooner or later, get to the password.

There are sites offering rainbow tables, which are databases of precomputed hashes.

For SQL / MySQL / ... hashes, there is a tool called "Hashcat" that helps compare hashes.

If you want maximum security, use tokens (which add randomness to the password) or digital certificates.

Take a look at the following sites:

Rainbow Tables

https://www.freerainbowtables.com/

Hashcat

http://hashcat.net/oclhashcat/

41

Use Salt.

That is, an alphanumeric sequence that only your system knows and that you add to passwords before hashing. That alone stops most attacks.

With a salt used correctly, even MD5 can be used with some peace of mind. Even better if you use a random, unique salt per user, stored separately from the password hashes.

Without a salt, even the most complex algorithms are susceptible to brute-force attacks using rainbow tables, for example.

39

Roughly speaking, I personally create a "salt" to combine with the password before hashing, a kind of "seasoning". For example:

$senhaDoUSuario = 'minhaSenhaFraca';
$salt = 's697er3z1680e6r87er2g35g6514'; //(catwalk)
$pass = sha1($senhaDoUSuario . $salt);  // add the salt before hashing

echo $pass;
    
// Password without "salt", only hashed
// 7cdbf96f878b6816abdecd3f564a897ab393cecd
    
// Result with salt, hashed
// 7cdbf96f878b6816abdecd3f564a897ab393cecds697er3z1680e6r87er2g35g6514

Even if a malicious user gets access to the hashed password, they will not be able to simply run it through software or applications that reverse the hash.

(The word "salt" is the English name of the seasoning.)

Ideally, the salt should also be saved in an environment variable so that it is available for verification.

For verification, you compare hashes: hash the password typed by the user concatenated with the salt, as in the example above, and compare the result with the stored hash.
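A minimal sketch of that check, written in Python for illustration. It mirrors the PHP example's sha1(password . salt) construction; the environment variable name APP_PASSWORD_SALT and the function names are hypothetical, and the fallback value is just the salt from the example above.

```python
import hashlib
import hmac
import os

# APP_PASSWORD_SALT is a hypothetical variable name; the fallback is the
# example salt from the PHP snippet and is only there so the sketch runs.
SALT = os.environ.get("APP_PASSWORD_SALT", "s697er3z1680e6r87er2g35g6514")


def hash_password(password: str) -> str:
    """Same construction as the PHP example: sha1(password . salt)."""
    return hashlib.sha1((password + SALT).encode("utf-8")).hexdigest()


def check_password(candidate: str, stored_hash: str) -> bool:
    """Re-hash the candidate and compare it with the stored hash."""
    # compare_digest runs in constant time, which avoids leaking where
    # the two strings first differ.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


stored = hash_password("minhaSenhaFraca")  # what would be saved in the database
print(check_password("minhaSenhaFraca", stored))
print(check_password("outraSenha", stored))
```

SHA-1 is used here only to stay close to the example; other answers in this thread recommend a deliberately slow function such as bcrypt, PBKDF2 or Argon2 instead.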

  • 2

    @Fabio added a way to do the check.

  • @Cmtecardeal, when you add a fenced code block, take care not to add unnecessary indentation.

18

Good hash functions for passwords

PBKDF2

PBKDF2 comes from PKCS#5. It is parameterized with an iteration count (an integer of at least 1, with no upper limit), an arbitrary salt (with no size restriction), the required output size (PBKDF2 can generate output of configurable size), and a PRF. In practice PBKDF2 is always used with HMAC, which is itself built on a hash function. When we say "PBKDF2 with SHA-1", it actually means "PBKDF2 with HMAC with SHA-1".

Advantages:

  • It has been specified for a long time and is still "unscathed".

  • It is already implemented in several frameworks (like .NET, for example).

  • Highly configurable (although some implementations do not let you choose the hash function, such as .NET, which uses SHA-1 only).

  • Approved by NIST (note the difference between hashing and key derivation; read below).

  • Configurable output size (read below).

Disadvantages:

  • It is intensive in CPU only, so it benefits much more from GPUs (the server is usually a normal PC doing ordinary tasks, while the attacker can invest in specialized hardware and take advantage of it).

  • You have to manage the parameters yourself (generate and store the salt, the number of iterations, etc.). There is a standard encoding for PBKDF2 parameters, but it uses ASN.1, so people avoid it whenever they can (ASN.1 is complicated for the uninitiated).
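As an illustration of managing those parameters yourself, here is a minimal sketch using Python's standard hashlib.pbkdf2_hmac. The iteration count, the use of SHA-256, and the "pbkdf2_sha256$iterations$salt$hash" storage format are assumptions chosen for this example, not a standard encoding.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Derive a key with PBKDF2-HMAC-SHA256 and encode all parameters with it."""
    salt = secrets.token_bytes(16)  # fresh random salt for every password
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    # Hypothetical storage format: scheme$iterations$salt_hex$hash_hex
    return f"pbkdf2_sha256${iterations}${salt.hex()}${derived.hex()}"


def verify_password(password: str, encoded: str) -> bool:
    """Recompute the derivation with the stored parameters and compare."""
    scheme, iterations, salt_hex, hash_hex = encoded.split("$")
    if scheme != "pbkdf2_sha256":
        return False
    derived = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations),
        dklen=len(hash_hex) // 2,
    )
    return hmac.compare_digest(derived.hex(), hash_hex)


if __name__ == "__main__":
    record = hash_password("minhaSenhaFraca")
    print(record.split("$")[:2])  # scheme and iteration count travel with the hash
    print(verify_password("minhaSenhaFraca", record))
```

Storing the iteration count and salt next to the hash lets the work factor be raised later without invalidating older records, since each record is verified with its own parameters.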

  • 4

    I believe PBKDF2 should be used only as a last resort. Bcrypt, even though it was created long ago, can still be better than PBKDF2. Today, there is Argon2i.

17

If possible, use Argon2: it won the PHC (Password Hashing Competition) and offers more tuning options than PBKDF2 and Bcrypt.

It lets you adjust:

  • Iterations: time cost, more operations will be done.

  • Memory: memory cost, more memory will be required.

  • Threads: parallelism cost, more threads will be started.

You have three options:

  • Argon2i: safe against side-channel attacks, since its memory accesses do not reveal traces of the input. This is ideal if you are using shared hardware or hardware that third parties can access, such as a public cloud server.

  • Argon2d: fully vulnerable to side-channel attacks, like Scrypt. However, it offers better security against GPU/FPGA attacks, since the data cannot be precomputed. This is ideal if you are on a secure platform or are not concerned about side channels, as in an Android app.

  • Argon2id: more recent, it combines the two above and is a middle ground between them. It offers better side-channel protection than Argon2d and better GPU protection than Argon2i. This is the best option whenever both kinds of attack are a concern.
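Argon2 is not part of Python's standard library, so the sketch below uses hashlib.scrypt as a stand-in only to show what tuning cost parameters looks like in code. It is not Argon2: in scrypt, n raises CPU and memory cost together, r sets the block size, and p is the parallelism factor, so the mapping to Argon2's three knobs is approximate. The parameter values and the function name are assumptions for illustration.

```python
import hashlib
import secrets


def memory_hard_hash(password: str, salt: bytes,
                     n: int = 2**14, r: int = 8, p: int = 1) -> bytes:
    """Derive a 32-byte key with scrypt (a stand-in for Argon2 in this sketch).

    n: CPU/memory cost (must be a power of two); roughly 128 * n * r bytes are used.
    r: block size.
    p: parallelism factor.
    """
    return hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=n, r=r, p=p, dklen=32
    )


if __name__ == "__main__":
    salt = secrets.token_bytes(16)
    key = memory_hard_hash("minhaSenhaFraca", salt)
    print(len(key), "bytes derived with about", 128 * 2**14 * 8 // 2**20, "MiB of memory")
```

With a dedicated Argon2 library the idea is the same: pick the variant (Argon2id when in doubt, as described above), then raise the time, memory and parallelism costs until a single verification takes as long as your server can afford.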

13

A very simple example. I take a word (which could be a password) and compute its MD5:

$ echo "teste" | md5sum
1ca308df6cdb0a8bf40d59be2a17eac1  -

If you search for the string "1ca308df6cdb0a8bf40d59be2a17eac1" on Google, you immediately discover that it is related to the word "teste". Attackers use so-called "Rainbow Tables", which are collections of MD5 or SHA-1 hashes (and other supposedly safe hashes) of known words.

Therefore, a hash alone is not enough to protect a password, because users choose weak passwords, and every strong password that has already "leaked" somewhere gets added to the "Rainbow Tables".

The basic technique for hashing a password correctly is to "salt" the password with additional data. For example:

echo "28234892423394 teste" | md5sum
2409fe72ab7c6d335b35dc9e7090951d  -

You will not find this MD5 on Google. Wherever you store the password, you would store both pieces of information, the "salt" and the MD5:

28234892423394 2409fe72ab7c6d335b35dc9e7090951d

The "salt" does not need to be encrypted, and it is of no use to someone trying to recover the original password from the hash (assuming the hash is good; MD5 is no longer considered good). When the user enters the password, you concatenate it with the "salt" and recompute the MD5. If it matches the stored one, it proves that the user knows the original password.

But you won't do any of this by hand. Practically every language has cryptographic functions like bcrypt() that do everything automatically. The other answers also explain other techniques, such as HMAC, at length.

  • 2

    This does not solve the problem: an exhaustive search remains extremely fast. Salt does not slow down the computation; the time stays the same, and any difference is imperceptible or nonexistent. This solves one problem, but a bigger one remains. Bcrypt is totally different from MD5, and HMAC was not made for passwords either, which is why there is PBKDF2, which applies HMAC many times to slow down exhaustive searches. Exhaustive searches work against passwords because passwords themselves have low entropy, and that includes searches done with Hashcat.

  • 2

    Also, if you add the salt as a prefix you may be nullifying it. In this case, if the salt is the same size as the MD5 block, it only needs to be compressed once. That is, you compress the prefix once, then just iterate over candidate passwords and compare the compressed value, gaining even more performance. MD5 uses Merkle-Damgård, which would be "equal" to CBC, so you can compress the salt once. :D
