The other answers already explained the problem of your algorithm, but just to give a few more solution options, an alternative is to use a Counter
:
from collections import Counter
def verificar_lista(lista):
return len(Counter(lista)) != len(lista)
The Counter
is a dictionary that contains, for each element of the list, the amount of times it occurs. So if his number of keys equals the size of the list, it’s because no element repeats. But in this particular case I confess that I find it an exaggeration to use Counter
(we’ll see why in the end), and would only be recommended if you actually need the amounts of times each element occurs.
Another option is to use set
and gradually add elements to it, until you find one that is already:
def verificar_lista(lista):
s = set()
return any(x in s or s.add(x) for x in lista)
The idea is to check if the element is already in the set
, and if not, add. If you find an element that is already, any
returns True
, indicating that the element is repeated.
Now let’s compare the performance of these solutions with the other answers. By doing a quick test with the module timeit
:
from collections import Counter
def set1(lista): # solução de outra resposta, com set
return len(set(lista)) != len(lista)
def set2(lista): # set com any
s = set()
return any(x in s or s.add(x) for x in lista)
def counter(lista): # collections.Counter
return len(Counter(lista)) != len(lista)
def sublista(lista): # solução de outra resposta, com sub-listas
for posicao, numero in enumerate(lista):
if numero in lista[posicao+1:]:
return True
else:
return False
from timeit import timeit
listas = [ [1, 9, 2, 4, 3, 3, 2], [1, 2, 3, 4, 5, 6, 7] ]
funcoes = ['set1', 'set2', 'counter', 'sublista']
# executa 1 milhão de vezes cada teste
for func in funcoes:
print(f'{func:>8}', timeit(f'for lista in listas:\n\t{func}(lista)', number=1000000, globals=globals()))
The times may vary from one machine to another, but in mine the result was (times in seconds):
set1 1.2527604
set2 3.2695244
counter 5.8885616
sublista 2.962209399999999
That is, the solution with set
of another answer was faster, followed by the sub-list solution. Afterward, set
with any
was the third and Counter
was the worst. In Ideone.com and in the Repl.it times were different, but the result was similar (Obs: no Ideone.com had to decrease the amount of times because it was breaking the time limit).
Of course, for a few small lists the difference will be insignificant. And the result also depends on the lists. For example, I did another test with this list:
[1, 1] + list(range(1000))
That is, two elements repeated at the beginning and a thousand more numbers (from 0 to 999). The result was:
set1 18.3008393
set2 0.9740259000000009
counter 45.1233362
sublista 2.3400793999999934
In this case, the alternative solution with set
(the one you use any
) was faster, because it detects right in the first iterations that the element is repeated (the any
interrupts the loop as soon as it finds a repeat, then it only iterates through the first 2 numbers).
The second fastest solution was with sub-lists, also because it closes the for
as soon as it finds the repeated number (the delay in relation to the first solution may be due to having to build the sub-list with 999 elements).
Already the solution with set
of the other answer took longer because it needs to build all the set
first, only then to catch the size.
And the solution with Counter
was the worst of all (much worse, by the way), because she had to count the amount of all 1000 elements (and this confirms that using Counter
in this case is really an exaggeration).
You’re going through the list values and literally checking if they’re on the list... they always will be, so the feedback is always
True
.– Woss