It is a problem that affects anything that does not have constant time, I believe that better explain with examples, maybe someone will answer the most theoretical part.
Suppose this type of code, in PHP:
function tem_simbolo_na_senha($texto)
{
$texto = str_split($texto);
foreach ($texto as $letra) {
if ($letra == "#" || $letra == "@" || $letra == "%" || $letra == "&") {
return true;
}
}
return false;
}
Someone created it because the system requires the user to enter a symbol in the password, this is a #
, @
, %
or a &
, then need to check if the password has a character of this type. If you saw the error in this code then you already know what the timming-Attack is, at its base.
The problem with the above code is that the return
is issued before or after depending on the text. This is the sooner there is a symbol the faster will be returned true
, so consider these two passwords:
$texto = 'yuqkdp69yHryACb778mvZsbFegIBU#';
^
$texto = 'yuqk#p69yHryACb778mvZsbFigIBUh';
^
If we measured the average time for each check to be made we would have:
yuqk#p69yHryACb778mvZsbFigIBUh demorou 1.6998052597046E-6
yuqkdp69yHryACb778mvZsbFegIBU# demorou 5.058741569519E-6
Look at this, here.
In all the tests the yuqkdp69yHryACb778mvZsbFegIBU#
is slower.
This is not just for "what you create", standard language features are also vulnerable, imagine this:
if($usuario === 'inkeliz'){
}
Now note the PHP implementation for ===
:
ZEND_API int ZEND_FASTCALL zend_is_identical(zval *op1, zval *op2) /* {{{ */
{
if (Z_TYPE_P(op1) != Z_TYPE_P(op2)) {
return 0;
}
switch (Z_TYPE_P(op1)) {
//...
case IS_STRING:
return (Z_STR_P(op1) == Z_STR_P(op2) ||
(Z_STRLEN_P(op1) == Z_STRLEN_P(op2) &&
memcmp(Z_STRVAL_P(op1), Z_STRVAL_P(op2), Z_STRLEN_P(op1)) == 0));
//...
default:
return 0;
}
}
Source code
I don’t have much knowledge in C. But as far as I know if both are pointed to the same memory address Z_STR_P(op1) == Z_STR_P(op2)
he will return true
, at once. If not, they must be the same size Z_STRLEN_P(op1) == Z_STRLEN_P(op2)
and then will use the memcmp
.
If we go any lower we’ll see that the memcmp
is:
int
memcmp(s1, s2, n)
CONST VOID *s1; /* First string. */
CONST VOID *s2; /* Second string. */
size_t n; /* Length to compare. */
{
unsigned char u1, u2;
for ( ; n-- ; s1++, s2++) {
u1 = * (unsigned char *) s1;
u2 = * (unsigned char *) s2;
if ( u1 != u2) {
return (u1-u2);
}
}
return 0;
}
According to this source code.
Note the similarity of this code to the first example, PHP, is almost identical! It will give a Return if the u1 != u2
, ie at the first different bit it will send something other than 0.
Therefore:
$usuario === 'inkeliz'
Considering this:
ixxxxxx
It will take longer than this:
axxxxxx
So little by little we get:
iaxxxxx
ibxxxxx
...
inxxxxx
inaxxxx
...
inkxxxx
The inkelia
will be much longer than the axxxxxx
, the memcmp
will abandon the boat only in the last letter while in the second case it delivers in the first difference, in the first letter.
So how can we do something that’s not vulnerable? Keeping the time equal, or closest to it, a time that is unrelated to the actual input.
In the first example we could do:
function tem_simbolo_na_senha($texto)
{
$return = 0;
$tamanho = mb_strlen($texto, '8bit');
for($i = 0; $i < $tamanho; $i++){
$return |= (int)($texto[$i] === "#" xor $texto[$i] === "@" xor $texto[$i] === "%" xor $texto[$i] === "&");
}
return $return !== 0;
}
That would cause:
1 - yuqkdp69yHryACb778mvZsbFegIBU# demorou 6.3819885253906E-6
1 - yuqk#p69yHryACb778mvZsbFigIBUh demorou 6.4213275909424E-6
1 - yuqksp69yHryACb778m#Zs#Fi#IBUh demorou 6.2981128692627E-6
0 - yuqksp69yHryACb778mvZsbFigIBUh demorou 6.1065435409546E-6
1 - yuqksp69yHryACb778mvZsb%igIBUh demorou 5.9833765029907E-6
1 - yuqksp69yHryACb778mvZsbFigIBU& demorou 6.1017274856567E-6
1 - yuqksp@9yHryACb778mvZsbFigIBUh demorou 6.4485788345337E-6
Look at this
That is, the processing time is independent of the input, regardless of where the special character is. It is logical the average times in some cases may be different, but it is not related to input, if it is still influenced will be a much less influence.
'Cause this is better?
The $return
keeps the return, but does not issue it immediately, thus traversing all the rest, whether or not to find another.
The xor
requires that all conditions of the if
are executed if it continues with the ||
it will terminate as soon as the first condition returns true
.
The case of comparison is different, most programming languages have an encryption library, they at all times include a secure comparison function, PHP has the hash_equals
, Golang has the ConstantTimeCompare
, Phyton has the hmac.compare_digest
...
But in general we do:
while(tamanho_string){
retorno |= 'A' ^ 'B'
return retorno
}
In the case of PHP it would be:
function comparacao_segura($input, $segredo){
$return = 0;
$tamanhoSegredo = mb_strlen($segredo, '8bit');
$tamanhoInput = mb_strlen($input, '8bit');
if($tamanhoSegredo !== $tamanhoInput){
return false;
}
for ($i = 0; $i < $tamanhoSegredo; $i++) {
$return |= (unpack('C', $input[$i])[1] ^ unpack('C', $segredo[$i])[1]);
}
return $return === 0;
}
That’s safe because?
Simple, it will run all over string, not giving a return
when a bit is different.
The $return
will always be updated, regardless of whether it is true or not, so it does not create a Branching Timing Attacks and the time will be equal.
If you’ve never seen bitwise operations before, then that would be the working behind:
Datum comparacao_segura("A...", "B...")
, he will:
A = 65
B = 66
Then what you do is a xor:
65 ^ 66
Which is actually:
0100 0010
0100 0001
= 0000 0011
So he makes a ou
(|
) with the value that was previously stored, in the case of:
0000 0000
0000 0011
= 0000 0011
That in the end it will be different from 0
.
If they are equal, the ^
will give 00000000
in all comparisons, hence the or
of 00000000
with 00000000
will be 0000000
and so it will be === 0.
What is the flaw?
It exposes the size of the secret, this is generally not seen as a problem, including the implementation of Phyton has this problem, said in the documentation.
Exists as "correct" this?
Not exactly, due to strlen
, but it is possible to get close:
function comparacao_mais_segura($input, $segredo){
$return = 0;
$tamanhoInput = mb_strlen($input, '8bit');
$tamanhoSegredo = mb_strlen($segredo, '8bit');
for ($i = 0; $i < $tamanhoInput; $i++) {
$return |= (unpack('C', $input[$i])[1] ^ unpack('C', $segredo[$i % $tamanhoSegredo])[1]);
}
return $return === 0;
}
That’s a little mathematical, so the change is:
$i % $tamanhoSegredo
This means that in comparacao_mais_segura('ABCDEF', 'AB')
, for the first time it will be 0 % 2 = 0, then 1 % 2 = 1, then 2 % 2 = 0, then 3 % 2 = 1....
But in general Length timing Attack, as it is called, is not considered such a serious problem, since knowing the size does not know its content.
Another issue is not to do this, which is considered vulnerable by "Branching Timing Attack":
if(algo){
funcao()
}else{
outra_coisa()
}
Suppose the encontrou_cadastro
search for the user’s email in the database (ex. "SELECT * FROM Tabela WHERE Email = ?
").
if(encontrou_cadastro($_POST['email'])){
compara_senha($_POST['senha'])
}
Even if the query time is fast enough, which in itself is no longer, there is still a problem.
This is somewhat logical, the function compara_senha()
will only be called if the user is found, so it is possible that someone, based on time, can find out if an email is registered or not.
In this case if the [email protected]
is registered it will compare the password if it will not perform anything. Therefore, if the "delay" page is because this email exists, if it does not exist, even if it clearly does not say this to the user it is able to know.
This depends on the context to be considered or not a problem. If your website already says "already registered email" when trying to register, then email is not a secret information, so using the above code is not a security problem, email is not secret.
If this is a problem, then do something like:
if(encontrou_cadastro($_POST['email'])){
compara_senha($_POST['senha'], senha_do_banco())
}else{
compara_senha($_POST['senha'], 'hahs-de-uma-senha-idiota')
}
So both cases will use the compara_senha
, nullifying their different ones, assuming they use safe comparisons. : D
There are generic solutions, which is to make a sleep
. This is used we can’t maintain a function in constant time so we do it "in brute force". We have determined that the function should always take 200ms, so no matter what happens it will take 200ms. The problem with that is:
- Setting a fixed time too long exposes a possible Dos.
- Setting a fixed time too short ends with the "protection".
- This only affects remotely, but not locally, so a "local timing Attack" is still fully possible, after all the function itself is not Constant-time.
When we talk about "local timing Attack" the thing gets worse, because there is much more information, it is possible to know where the data is stored (e.g. in RAM, in cache L1, L2, L3) only based on time, each one has different speeds, so this is physical and not software. In this case there are more attacks, which are even the "biggest problem" for cryptography, but of this subject I do not understand so much. :(
Very good explanation, congratulations.
– Don't Panic