With the very Sieve of Eratosthenes I achieved very satisfactory results. It is worth remembering that, as the sieve returns the list of all primes smaller than the input value, we will have to inform the highest prime value that the program will accept. The statement states that it is guaranteed that no number will be requested after the 200,000°, so initially we must know what this value is: 2.750.159.
A very simple implementation of the sieve is:
def sieve_of_eratosthene(N):
# Cria-se uma lista referente a todos os inteiros entre 0 e N:
A = [True] * (N+1)
# Define os números 0 e 1 como não primos:
A[0] = A[1] = False
# Percorra a lista até encontrar o primeiro número primo:
for value, prime in enumerate(A):
# O número é primo?
if prime:
# Retorna o número primo:
yield value
# Remova da lista todos os múltiplos do número enontrado:
for i in range(value**2, N+1, value):
A[i] = False
So we can get all 200,000 first prime numbers by making:
primes = list(sieve_of_eratosthene(2750159))
And generate the output as follows:
print("Saída:", [primes[i] for i in [7, 1, 199999, 4]]) # Saída: [19, 3, 2750159, 11]
Being [7, 1, 199999, 4]
the entry of the program.
Note: 199.999° prime number will be 2.750.131, as expected, only if number 2 occupies index 1. Whereas the implementation sets index 0 for value 2, the expected output value would refer to index 199.998°.
Using the module timeit
for the measurement of time, I obtained an average time, in 100 executions, of 0.69539769177063s
.
See working on Ideone.
A brief explanation of the code
The first line of the function, A = [True] * (N+1)
creates a list of N+1
elements, all defined as True
, initially indicating that all values between 0 and N are prime numbers. Soon after, the values 0 and 1 are defined as False
, indicating that these are not prime numbers. And with a repeat loop, all other values are traversed and whenever you find a prime value, return it and delete it from the list, defining as False
, all numbers that are multiples of that prime found. For a better view, the list A
would be something like:
A = [False, False, True, True, False, True, False, ...]
Indicating that 0 is not prime, 1 is not prime, 2 is prime, 3 is prime, 4 is not prime, 5 is prime, 6 is not prime, so on. What the function enumerate
does is to return a pair of values where the first represents the index in the list and the second the value itself. Thus, doing enumerate(A)
, would be returned something similar to:
[(0, False), (1, False), (2, True), (3, True), (4, False), (5, True), (6, False), ...]
And that’s why in for
there are two values. The first value returned from enumerate
is assigned to the value
and the second to prime
, so when prime
is true, we know that value
will be a prime number.
for value, prime in enumerate(A):
...
Already the yield
plays the role of return
, however, for a generator. I confess that I ended up complicating more than necessary in this code using a generator, because since the generator should be converted to a list, I could generate it directly. The code would look like this:
def sieve_of_eratosthene(N):
# Lista de números primos:
numbers = []
# Cria-se uma lista referente a todos os inteiros entre 0 e N:
A = [True] * (N+1)
# Define os números 0 e 1 como não primos:
A[0] = A[1] = False
# Percorra a lista até encontrar o primeiro número primo:
for value, prime in enumerate(A):
# O número é primo?
if prime:
# Retorna o número primo:
numbers.append(value)
# Remova da lista todos os múltiplos do número enontrado:
for i in range(value**2, N+1, value):
A[i] = False
return numbers
See working on Ideone.
Note that the main difference is that where before I used the yield
, now used the append
from a list.
An extra implementation
As commented by LINQ in its reply, the fact of having to know beforehand the n-nth prime value can be considered a limiting of the solution. A practical alternative is to apply the mathematical concept presented at the end of this answer to calculate a prime value close to the desired one. Knowing that the n-nth prime number, Pn, is less than n ln(n) + n ln(ln(n))
, if we calculate all the prime values up to this value we are sure we have calculated the Pn value. Something like:
def sieve_of_eratosthene(N):
N = floor(N*log(N) + N*(log(log(N))))
# Lista de números primos:
numbers = []
# Cria-se uma lista referente a todos os inteiros entre 0 e N:
A = [True] * (N+1)
# Define os números 0 e 1 como não primos:
A[0] = A[1] = False
# Percorra a lista até encontrar o primeiro número primo:
for value, prime in enumerate(A):
# O número é primo?
if prime:
# Retorna o número primo:
numbers.append(value)
# Remova da lista todos os múltiplos do número enontrado:
for i in range(value**2, N+1, value):
A[i] = False
return numbers
See working on Ideone.
Just by adding the first line, we can now call the function sieve_of_eratosthene(2e5)
to get the first 200,000 prime numbers. In fact, you get even more, so the running time can increase.
Possible alternative
It is proved mathematically, if I understand correctly, that the n-nth prime number, being n greater than or equal to 6, shall belong to the set of numbers defined by:
For example, for n = 199999
, the following set is obtained:
2741586 < P_n < 2941585
That contains the expected value 2750159; however, this will not be the only prime number in this range, so the challenge following this logic would be to correctly identify the value within the range.
Good question. I managed to do it in 12 seconds on the Chrome console myself, I’d like to know how to do it faster.
– Oralista de Sistemas
The question is one of the best made of the site, the title not so much. This time requirement is complicated.
– Maniero
@Bigown improved the title, leaving the requirement of less rigorous time.
– santosmarco_
Little brother, I have an example in Java that produces cousins very quickly, maybe I can help you link: https://github.com/HallefBruno/Estrutura-dados/blob/master/primo.java
– Bruno
So, it doesn’t say you can do it in 1 second in Python. It might take 3 times, or 10 times just because it’s Python. I’m not saying it does, but it might be. Ali makes a naive statement. There’s a way to make it go faster. Does something that distributes the calculation on 1000 machines or at least colors. When they wrote this was a time, the machines evolved. Imagine that written in the 90s today would give with a horrible algorithm.
– Maniero
One of the most complex problems there is is generating primes, the problem gets worse the more numbers you want. There are complex algorithms that can be improved, I haven’t even seen your code, but you don’t have to check everything. I didn’t even choose to mess with not generating the cousins but to provide them :) See https://en.wikipedia.org/wiki/Sieve_of_Atkin
– Maniero