To determine the strength of a password, you need to analyze how attackers proceed when trying to break a password. This excellent article by Bruce Schneier (in English) describes the state of the art in breaking passwords, either "blind" (i.e. when nothing is known about the user in question) or directed. One also considers the scenario where the attacker has access to the victim’s data (e.g., a copy of his hard drive), but this extrapolates the scope of the question.
Forms of attack
The set of possible passwords is virtually infinite, unless there is a size limit. However, the set of passwords probable is much smaller. Consider an 8-letter password, for example. It is much more likely that the user will choose one that contains a word than simply a random set of characters (type yrhxmmpl
). An attack that tried all 8-letter combinations (208 billion) would have a success rate lower than one that tried, say, all the words of 11 most widely spoken languages (3.2 million).
Determining which passwords are the most "likely" is not difficult: just do a statistical analysis. You can do this with users of your own service, or use data from one of several "leaks" of passwords that always occur around the world (often involving large companies, with many customers, so sampling is significant). By way of example, some conclusions mentioned in the article:
Some passwords are very common, despite their weakness (password
, letmein
, 123456
, etc). It pays to test by a list of, say, 1000 most common passwords before anything else.
- If combined with the 100 most common affixes (see below), this set of 100,000 candidates break almost 25% of passwords.
The most common password format consists of a "root" and an "affix". Suffixes are more common than prefixes. The root is usually a word - perhaps transformed by a few simple rules (e.g..: s -> $
, a -> @
, t -> 7
, etc) or a given name. The affix in general is a number, a special symbol, or both.
- It is estimated that this test is sufficient to break from 55% to 65% of passwords, although the time spent is on the order of weeks.
When using victim’s personal information in conjunction with the above attacks (either by analyzing other known passwords, to identify common patterns, or by taking names, addresses, numbers, etc and adding to root and affix dictionaries before carrying out the attack) the success rate increases slightly, and the time spent on the attack decreases considerably.
Finally, when the dictionary attack proves ineffective, it is still preferable to test for things that if appear with words rather than simply random strings. Random sequences yet pronounceable are more likely to be used as passwords than those that are not.
This is not the only form of attack, of course, but it is the most effective. If a "type" of password becomes very popular, adapting the attack to this type would give a better chance than a generic attack. It should also be noted that the standards presented above derive almost directly from the policies that companies impose in the choice of passwords: if everyone asks for "a letter, a number and a special character", users will choose the simplest sequence that meets these minimum criteria, and exchange experiences and "tips" with each other - leading to a standard that technically meets the requirements but does not provide the desired protection.
Strong passwords
A strong password is simply a password that will not be detected by the default used by the attacker, either because it does not belong to the tested set or because it even belongs, but will be tested last according to the rules. It becomes tempting simply to reject the pattern, but then you just fall into another pattern. Especially because, if your service will reject passwords according to some objective criterion, it is very important inform the user of what these criteria are (to avoid for example the drama of cabbage... :Q). And in doing so, you also inform the attacker what type of password it is useless to try...
But an even more important reason for not creating arbitrary rules for passwords is that we need to take into account the human factor safety, not only the "machine" factor. How already satirized in XKCD and well debated in security.SE, a password needs to be difficult for the machine to guess and easy to remember user. Ignoring this second part leads to practices even more damaging to security than a weak password, such as the habit of annotating the password on a post-it and pasting it on the monitor.
Fortunately, there are ways to produce passwords that are relatively simple to remember and quite difficult to guess - even if the pattern used is known. One is the use of the first letter of each word of a phrase [relatively long], as a password produced in this way is almost indistinguishable from a random sequence[Citation needed].
Entropy
Finally answering the question: the formal method for determining the strength of a password is the entropy (as defined in Information Theory; not to be confused with the concept of the same name in Thermodynamics). Because the password consists of a secret that the owner knows but the attacker does not know, its security lies in the fact that an attempt to "guess" this password (i.e. choose the correct possibility in a set of possible values) has a negligible chance of success. However, it is not enough that the set is large, if some candidates occur much more frequently in this set than the average, it is said that the entropy of it is low.
And it’s good to make that clear: entropy is not a feature of the password itself, but yes of the process that originated that password. The question that needs to be asked is: "when using the X process to choose a password, what is the probability of the Y password being chosen?".
Measuring entropy is complicated, and subject to many mistakes. As an example, a common method for estimating the strength of the password is to check which character sets the components of the password belong to, and to assume that a password of size N belongs to the set of possible strings with N characters of that set. This method is flawed because - assuming a password like Password1$
- it is established that the set size is 95 (uppercase letters, lowercase, digits and ASCII symbols) and the password size is 10, and it is estimated that "65 bits of entropy" (log2 9510), which would be a pretty strong password! But if the calculation is done in another way - say a word in the list of the most common 1000 (10 bits), with or without the first letter capitalized (1 bit), followed by a digit (3.3 bits) and a special symbol (5 bits), in one order or another (1 bit), it reaches 20.3 bits of entropy, revealing that it is too weak a password...
(By the way, which of these two calculations is correct? Well, both! The procedure of generating a random password with 10 ASCII characters will reach Password1$
once in 265, and the procedure for choosing a word from a list of 1000 [containing password
], capitalize or not, and add a digit and a symbol will reach Password1$
once in 220.3. Remember, entropy is process, not password. If there is such a thing as "password strength", I would say that it is the smallest entropy among all the "reasonable" strategies that a user may have employed to choose his password. But this is still subjective.)
Still, you can set lower limits for entropy. An 8-letter lowercase password, for example, will have a maximum of 37 bits of entropy (log2 268). Probably less, but never again. Because a password with this format sluice only 37 bits of entropy. There is much talk that longer passwords are safer than shorter passwords, but this is not true in all situations (pqzrwj
is a more secure password than password1
) - what is true is that in a longer password "fits" more entropy than in a shorter password. Choosing a strong and memorable password with 20 characters is easier than doing the same with 8.
Applying entropy analysis to determine the "strength" of the password
If there are N different ways of calculating entropy, which one is the most "correct", or perhaps the most "useful"? I would say that one can divide the analysis into two cases: one that takes into account the average case, and another that takes into account the worst case.
Suppose they do exist E
strategies that the attacker could use to try to figure out a password. Each strategy represents a specific password generation (choice) procedure. Associated with each procedure, means of measuring/estimating its entropy can also be associated. Some examples of procedures, and corresponding strategy, would be:
P0 - Gerar uma senha aleatória de 8 caracteres.
E0 - Testar todas as sequências de 8 caracteres uma por uma.
M0 - Calcular o tamanho do menor alfabeto que contém a senha, elevado ao tamanho da senha.
P1 - Escolher uma palavra do dicionário e anexar um sufixo numérico.
E1 - Escolher um dicionário, testar todas as palavras desse dicionário combinadas com todos
os sufixos até um tamanho X.
M1 - Escolher um dicionário de frequências de palavras, verificar se a senha contém uma
palavra, qual sua probabilidade de ser escolhida dada sua frequência, e acrescentar
3.3 bits de entropia para cada dígito no sufixo.
P2 - Escolher quatro palavras do dicionário.
E2 - Escolher um dicionário, testar todas as combinações de quatro palavras.
M2 - Escolher um dicionário, calcular quantas combinações de quatro palavras existem,
verificar se a senha é composta de quatro palavras nesse dicionário.
P3 - Pensar numa frase, pegar a primeira letra de cada palavra dessa frase.
E3 - Escolher um dicionário e uma gramática, gerar todas as frases possíveis.
M3 - ???
...
In a worst-case analysis, we would assume that the attacker knows which procedure the users most use, and would simply choose the strategy En
to the detriment of all others. Thus, the entropy of each process to which the password applies would be estimated, and the lowest value found would be considered as "the strength of the password".
The average case analysis is similar, only that the assumption that the attacker will choose the right strategy every time does not exist. Instead, one can assign a probability to each of the strategies considered, and to make a weighted average of all the estimates obtained instead of choosing the lowest value found. Alternatively (and I believe, more correct, although it is only my opinion), one can add bits of entropy to each estimate representing the chance that the attack strategy was used, combined with the chance that the password was discovered according to the same strategy, and continue considering only the lowest value found.
An analysis of what would be more accurate, considering only the average case, the worst case or something in the middle of the way would be too extensive to be exposed here. I suggest the study of Game theory for a better understanding of the most appropriate way to behave in the presence of an opponent who also takes target choices into account when adapting his strategy.
Completion
There is no consensus on how best to choose passwords, nor on how to estimate the strength of a particular password. If you decide to be liberal in the types of password you accept, my suggestion is to keep a list of common passwords and/or words and reject them immediately, reject known user data as part of the password (as quoted in the question itself)and reject passwords whose size is too small. Even so, I fear you will have little success in ensuring an adequate level of security in accepted passwords.
If you decide to be draconian, on the other hand, it is possible not only to ensure a minimum force, but it can be done without antagonizing the user too much. The very suggestion of XKCD (4 randomly chosen words from a 2000 list), for example, guarantees passwords with little less than 44 bits of entropy, which is not particularly high but is significantly higher than the average of freely chosen passwords by the user (speak on an average of 40 bits, but the reality is probably much smaller). With a little creativity, one could think of an even safer and easier to remember pattern. It would therefore be sufficient indicate to the user the mandatory format, suggest a password automatically, and verify if the final password matches the expected format (maybe leaving some margin).
This would not necessarily be well received by users (some will find it too complicated, others will have their own concept - probably wrong - of what is a "safe password" and will bother not being able to use it), so I am not recommending its use, simply citing as the only way I know to ensure a certain accuracy in the measurement of password strength.
Some links about passphrases instead of passwords: this, this and this
– Caputo
Check out the zxcvbn.js library created by Dropbox staff: https://blogs.dropbox.com/tech/2012/04/zxcvbn-realistic-password-strength-estimation/ ------------------------------------------------------------------ https://github.com/dropbox/zxcvbn
– user622
Good question. I don’t understand the subject, but I was going to guess (and partially guess) entropy: http://en.wikipedia.org/wiki/Password_strength (the problem is the "People are notoriously Poor at achieving sufficient Entropy to Produce satisfactory passwords"). :)
– Luiz Vieira
xkcd on the subject
– Bruno Costa