UPDATE: Anderson Carlos Woss has a great answer about Crivo, answered a month before mine: /a/206144/64969
I was a member of the Programming Marathon team at my college. We were the Balloon Collectors, from UECE. And cousins were almost constant during the Marathon issues, so we needed something fast, practical and efficient, very efficient.
The algorithm that we used to calculate prime numbers was the Sieve of Eratosthenes. Why was he efficient? Because he calculated all cousins less than or equal to n
in a time of o(n log log n)
; and only needed to do this once, we could store this calculated value at the beginning of the program and use it further. I also need to point out that the Sieve requires memory o(n)
, so it can only be used for small numbers (like 700 million in C, maybe 100 million in Python, for example).
How does this algorithm work? Well, it works by marking the knowingly raw positions/composite numbers of a vector based on the primes it discovers. At the end of the Sieve, all unmarked positions remaining are known prime numbers.
Using the resulting prime vector to calculate the largest prime smaller than n
Suppose I have a vector called numero_eh_primo
boolean. He needs to have at least n + 1
positions to work the Sieve. Obviously the Sieve has already rotated and we have this vector filled as follows:
numero_eh_primo[i]
is true if i
for primo
numero_eh_primo[i]
is false if i
is composed
To find out which prime is smaller than or equal to n
, just go through the vector from n
, decreasing a unit if the position is false and returning the first index found if the value is true.
If I have enough memory, I can apply a memoisation to make future consultations to the same number n
constant order. For this, I need an auxiliary vector called maior_primo_menorque
, that will return the largest prime greater than or equal to the index passed. For example: maior_primo_menorque[100]
returns 97 when completed.
How to know if the value is not filled in? Simple, if null. If null, how to fill in? Well, let’s go to the memoization algorithm:
def get_maior_primo(n):
if maior_primo_menorque[n]:
return maior_primo_menorque[n]
else:
if numero_eh_primo[n]:
maior_primo_menorque[n] = n
return n
else:
maior_primo_menorque[n] = get_maior_primo(n - 1)
return maior_primo_menorque[n]
The Sieve of Eratosthenes
So far I only said that I used the Sieve, but at no time I talked about how this Sieve works. Come on.
The Sieve begins with a vector with n + 1
positions, all set as a priori truth (ie, are possible primes). So, we disregard the numbers 0 and 1, marking them as false. After that, we go through the vector sequentially, until we find an index that is set as truth. At this point, we find a true cousin, so we should mark all its multiples as false; after marking the multiples as false, we go back to the vector sequentially.
def crivo(n, vetor_crivo_inicializado_como_true):
vetor = vetor_crivo_inicializado_como_true
vetor[0] = False
vetor[1] = False
for i in range(2, n + 1):
if vetor[i]:
# achou um primo, vamos marcar todos os múltiplos relevantes
marcar_multiplos(n, vetor, i)
def marcar_multiplos(n, vetor_crivo, primo):
for i in range(primo*primo, n + 1, primo):
vetor_crivo[i] = False
Note the optimization of starting to mark multiples from the square of the prime: every multiple of this prime with another value has already been marked in a previous step. For example, with prime found 5, I have already marked the numbers 10 (multiple of 2), 15 (multiple of 3) and 20 (multiple of 2), the first composite number being the square of 5, 25.
The @LINQ gave me the tip to do a MCVE, I’ll send one as soon as possible
Thanks for the help.
– Roberto
Have you heard of Sieve of Erastothetes?
– Jefferson Quesado