37
How are prime numbers an important part of some encryption systems? How this process works and what part prime numbers come into?
37
How are prime numbers an important part of some encryption systems? How this process works and what part prime numbers come into?
33
There are two types of encryption: the symmetrical (where both participants share a secret) and the asymmetrical (where one has a secret that the other does not possess). The symmetric makes no use of prime numbers in general (it is based on Discrete Probability), while the asymmetric makes extensive use of them (it is based on Number Theory, such as pointed out by Alexandre Lara). The reason for this is that while the symmetric can base all its security on the shared secret, the asymmetrical cannot do this, because one side of the communication does not have access to this secret, and therefore cannot do operations with it. It is necessary to seek this security somewhere else.
Often this security is found in intractability of certain problems of mathematics. One of them, as also mentioned, is the difficulty of factoring large numbers efficiently. Another is to compute the discrete logarithm of a number any module a prime number (e.g.: if I raise 5
to x
and do the rest of the division by 7
, the result will be 2
; how much is x
?). And these problems typically involve prime numbers, since several theorems state that operation X is easy/difficult/possible/impossible if its operands are prime.
I still know little about it, but I’ll give you two very simplified examples that demonstrate the importance of cousins in cryptography: Diffie-Hellman and the RSA. In the case of Diffie-Hellman, the prime number used in the calculation need not be secret, it only provides the properties necessary for the desired calculation to be possible. At RSA, cousins need to be chosen at random and kept secret.
Let’s say Alice and Bob want to communicate confidentially, using symmetric encryption, but they have no common secrets. To simplify, let’s say that your concern is only with the confidentiality of the communication, not with its integrity (i.e. they fear that someone will monitor the communication, but not alter it). You both know a prime number p
very large and a value g
amid 1
and p-1
. A way for both to "produce" a shared secret with the aid of these numbers is as follows:
Alice picks a number a
and calculates the rest of the g
raised to a
for p
(i and.. A = g^a (mod p)
). Bob also picks a number b
and calculates B = g^b (mod p)
. Both exchange these values A
and B
among themselves. By the laws of exponentiation, Alice can calculate B^a = (g^b)^a = g^(a*b) (mod p)
. Bob can also calculate A^b = (g^a)^b = g^(a*b) (mod p)
. That is, they both calculated the same value, and this value can be used as the shared secret needed to use symmetric encryption.
Okay, but why is this value a "secret"? Well, only Alice knows a
and only Bob knows b
. An attacker who monitored communication would only see A
and B
. How to calculate g^(a*b) (mod p)
from these values? Well, if A = g^a
and the attacker knows A
, g
and p
then he just has to find one a
such that g^a = A (mod p)
and then do B^a
, as Alice did. But that is just the discrete logarithm mentioned earlier! If the domain were the Real numbers, computing the logarithm of a number is trivial (log_a(b) = ln(b)/ln(a)
), but in discrete numbers this is much more difficult - at least when this operation is done module a prime number.
Importance of cousins for DH: Remember that parameter g
previously chosen? It occurs that all numbers that are primes with p
can be "achieved" from some power of g
. Using 5
and 7
as examples, we have to:
1 = 5^0 = 1 (mod 7)
5 = 5^1 = 5 (mod 7)
25 = 5^2 = 4 (mod 7)
125 = 5^3 = 6 (mod 7)
625 = 5^4 = 2 (mod 7)
3125 = 5^5 = 3 (mod 7)
15625 = 5^6 = 1 (mod 7) <-- começou a repetir; mas todos os nºs entre 1 e 6 já aparecerem
78125 = 5^7 = 5 (mod 7)
...
But if 7
is prime, so the "prime numbers with 7
" will be all numbers smaller than 7
(except zero, and with the addition of one). If the chosen module was 12
, for example, it would have far less results:
1 = 5^0 = 1 (mod 12)
5 = 5^1 = 5 (mod 12)
25 = 5^2 = 1 (mod 12) <-- já?!
125 = 5^3 = 5 (mod 12)
625 = 5^4 = 1 (mod 12) <-- sigh... vamos tentar outro?
1 = 7^0 = 1 (mod 12)
7 = 7^1 = 7 (mod 12)
49 = 7^2 = 1 (mod 12) <-- argh!
1 = 3^0 = 1 (mod 12)
3 = 3^1 = 3 (mod 12)
9 = 3^2 = 9 (mod 12) <-- agora vai!
27 = 3^3 = 3 (mod 12) <-- d'oh!
1 = 9^0 = 1 (mod 12)
9 = 9^1 = 9 (mod 12)
81 = 9^2 = 9 (mod 12) <-- aí já é perseguição...
Not every compound number is so "bad" (think), but the fact is that if from a generator g
only a small subset of numbers smaller than p
, then the chance for Alice and Bob to choose a unique and secret key greatly diminishes (as the opponent may not know a
and b
, but he’ll end up finding another number x
such that A = g^x
or B = g^x
with very little effort). Hence the importance of a prime number and large pro protocol (has so many x
which could result in g^x = A
you can’t test them all).
Let’s say you code (encounter) a text to be transmitted in confidence in the form of a number M
. You want to send this text to me, but we don’t share any secrets. However, I know two cousins p
and q
large that no one else knows (i.e. no one knows that I picked out these two among the sea of known cousins), I multiplied the two resulting in m = p*q
and published this m
, along with a number e
any (that e
does not need to be cousin, but needs to meet certain simple requirements). You calculate then C = M^e (mod m)
and sends C
for me.
Well, what I care about is not C
, and yes M
. And for the attacker, it’s all about M
. But how to do reverse operation, calculate M
from C
, m
and e
? If it were in the context of Real numbers it would be easy: just do the "root e
-twentieth of C
". Only when you did the rest of the division of M^e
for m
you "threw away" most of the number - C
is small, much smaller than M^e
, and perhaps not even a "perfect square" (metaphorically speaking). You can’t just calculate this root, and even if you do, you end up getting a different result than M
...
Fortunately, there is a theorem that says m = p*q
then there’s a number f = (p-1)*(q-1)
that can help me calculate M
: occurs that the number e
previously chosen has an inverse module f
which is easy to calculate, and if I have that number - which I will call d
- then I just make a new exponentiation to find the result I was looking for:
C = M^e (mod m)
M = C^d (mod m)
How I know my cousins p
and q
, it’s easy for me to calculate d
. But who does not know these numbers, and can not factor m
to discover them (because m
is a very large number) does not have the "tools" necessary to achieve the same result. It is not possible to test all numbers 0 <= d < m
possible, and has nothing to facilitate this calculation without the knowledge of the cousins involved. So that while "factoring" is a difficult problem, decipher C
from the public information submitted (m
and e
) it will also be difficult.
Importance of cousins for RSA: if the number m
is chosen at random, factoring it should not be too difficult - since it is quite likely that it has many small factors. And whenever a small factor is found, one can divide m
by this factor and now the problem of factorization is applied to a much smaller number. In the end, one can use the brute force to factor what is left over. And once the prime factors are known, one can calculate the function phi(m)
and based on it find the reverse d
of e
capable of decrypting encrypted data.
Using a number that only has two prime factors, it is guaranteed that the factorization process will be as difficult as possible for those who do not know the factors, but the whole easy scheme for those who know them. Otherwise it would be necessary to use much larger numbers, which would eventually make the whole process difficult for both.
As I have already commented on your previous question about quantum computing, several of these problems that are intractable today would become treatable if a quantum computer with sufficient capacity were built. The security of the RSA would be totally broken (because the Shor’s algorithm would allow factoring large semiprimes in polynomial time) and Diffie-Hellman also (both the "classic" - the one I described in the answer - and the Elliptic Curves - which is essentially equivalent but much more efficient - because the same algorithm also allows calculating the discrete logarithm in polynomial time).
To lattice-based cryptography (or were they "lattice"? ) is considered resistant to quantum computing, but I can’t tell you whether it is based on some property of prime numbers or not. Anyway, I mentioned here to emphasize that what gives security to asymmetric cryptography are intractable problems, so the importance of prime numbers today is due to the amount of open problems involving this type of number. And given the difficulty of building efficient quantum computers, it is likely that this importance will continue for a long time.
15
The first thing you should keep in mind is that cryptography is an area extremely related to Number Theory.
There is a concept in number theory called integer factorization, in which it states that every compound number (those are not primes) can be represented in prime factors.
If you take 2 large prime numbers and multiply them, you will get a very large integer (non-prime).
The applications of this in cryptography are in the difficulty and time it takes to factor a very large integer into prime factors. This is because there is no magic formula, you have to test all combinations of numbers down to at least the square root of the number you want to get. For example, let’s say we have the number 36:
36 / 2 = 18
18 / 2 = 9
9 / 3 = 3
3 / 3 = 1
I factored the number 36 into 2 * 2 * 3 * 3, where 2 and 3 are prime factors.
RSA cryptography, for example, is quite dependent on this concept of Fatorization in prime factors. Summarizing this type of cryptography: The public key consists of multiplying two large prime numbers, while our private key consists precisely of the prime numbers used.
Right, but what’s the point of that?
An encryption algorithm doesn’t necessarily mean that its goal is to be unbreakable. In theory, for example, by brute force it is possible to break almost all (or perhaps all) existing encryption algorithms. A good encryption algorithm aims to ensure that the information will remain secure long enough until its value is of no use to a malicious person .
In the case of RSA, users can use the public key (consisting of the product of large prime numbers) to encrypt the information, while you will use your private key (the prime numbers used in multiplication) to decrypt. Anyone else who wants to get this information will have to factor the number, and that would take a gigantic amount of time.
You can read more on Wikipedia about the RSA encryption algorithm.
1
Prime numbers have importance not only in cryptography but in numerous other computer science topics, because a prime number can form a "Galois field".
For example, a field based on number 7 and with generator 2 (but any number between 2 and 6 would serve as generator):
2 * 0 -> 0
2 * 1 -> 2
2 * 2 -> 4
2 * 3 -> 6
2 * 4 -> 1 (resto da divisão de 8 por 7, ou mod 7)
2 * 5 -> 3 (mod 7)
2 * 6 -> 5
2 * 7 -> 0
2 * 8 -> 2
...
That is, multiplying the generator by "n" times, we can get any field value, that goes from 0 to 6 (7 elements).
Assuming you have a message composed solely of numbers between 0 and 6, a very simple encryption algorithm could be created based on this property: encrypt by multiplying by 4, and decrypt by multiplying again by 2. Like 4x2=8, and the rest of the 8-by-7 split is 1, encrypt and decrypt then has the net effect of multiplying by 1, which is the original message, so it is "proven" that the original message is recoverable.
A "field" based on a non-prime number is not a field, but a ring, because not every number below the ring size (or order) can be a generator. In a ring of 12 elements, 5 could be a generator (because 5 has no factor in common with 12), but 2 can’t, 3 can’t, 4 can’t, 6 can’t...
Now, if we "generate" the numbers in another way, using exponentiation, for example with generator 2 and field 7:
2 ** 0 -> 1
2 ** 1 -> 2
2 ** 2 -> 4
2 ** 3 -> 1 (mod 7)
2 ** 4 -> 2 (mod 7)
....
We see that generator 2 is no longer suitable because it starts repeating results too early. It generates only values 1, 2 and 4 repeatedly. For a field 7, only generators 3 and 5 serve (if the generating operation is exponentiation):
3 ** 0 -> 1
3 ** 1 -> 3
3 ** 2 -> 9 mod 7 = 2
3 ** 3 -> 6 (mod 7)
3 ** 4 -> 4
3 ** 5 -> 5
3 ** 6 -> 1
3 ** 7 -> 3
...
Note that the number 0 cannot be generated, so the period of the generated numbers is always a unit smaller than the field. If we have to encrypt a message with 6 "characters", field 7 is suitable.
The case is: in a ring whose number of elements is not prime (type 8, 9, 10, 12 elements) THERE IS NO GENERATOR that works with exponentiation. The generated sequence starts repeating itself very quickly, presenting only 2 or 3 different values, or even degrading to a sequence of zeros.
To make an exponentiation-based encryption, see the other answer that explains RSA encryption, which basically works thus: choose a generator and two exponents that "cancel" each other.
1
Because the theory is based on the difficulty in factoring prime number, say extremely long prime numbers, with hundreds of digits.
Cryptography is a science based on number theory. And integers can be decomposed into prime numbers (exception of 0 and 1).
Many algorithms (RSA for example) are created based on this difficulty in factoring prime numbers.
Factoring prime numbers is easy: the factors are 1
and the cousin himself! : ) What is difficult is factoring semicousins - numbers composed by exactly two prime factors, and each one of them very large.
Browser other questions tagged cryptography
You are not signed in. Login or sign up in order to post.
You spoke in cryptography, you appear. + 1 for the answer and the excellent work you’ve been bringing to this community.
– Avelino
applause .... before I didn’t know very well about cryptography, but this little text showed me what 3 pages tried to explain to me.
– CypherPotato
I believe that the most modern algorithms are no longer using the concept of single factorization and with this primes lose their importance in this aspect.
– Motta
@Sorry Motta, I don’t get it. There are algorithms of all kinds, and in fact not all of them are based on prime numbers. In particular the ECC (elliptic curves), the main "substitute" of the RSA nowadays, is not based on factorization, but I can’t tell if it uses prime numbers or not (my knowledge of cryptography is still quite limited).
– mgibsonbr
@mgibsonbr my also but I believe that large primes are no longer used in these new ones. https://pt.wikipedia.org/wiki/Fun%C3%A7%C3%A3o_esponja
– Motta