If you can use generators (and I don't see why you couldn't), then instead of storing the entire list of string and index to track what has already been returned, you can store only the string, which is your reference. The logic is basically this: for each item in your input list, check whether its string has already been returned; if not, yield the pair with string and index; if it has, just ignore it.
    def remove_duplicates(sequences):
        returned = []  # strings that have already been yielded
        for sequence in sequences:
            # sequence[0] is the string, sequence[1] is the index
            if sequence[0] not in returned:
                yield sequence
                returned.append(sequence[0])
So just do:
    nao_repetidos = list(remove_duplicates(lista))
Since the function returns a generator, just wrap it in list() to get a list.
Some optimizations are possible. You can make returned a set instead of a list, which speeds up the membership check. And if the input list is already sorted by string, so that equal strings are adjacent, you don't need to keep every returned string in memory: just check whether the current string in the loop is equal to the last string returned by the function.
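The first optimization could look roughly like the sketch below. The name remove_duplicates_with_set and the sample data are mine, just for illustration, and it assumes the strings are hashable (which str always is):

```python
def remove_duplicates_with_set(sequences):
    seen = set()  # strings already yielded; membership check is O(1) on average
    for sequence in sequences:
        key = sequence[0]  # the string is the reference, the index is ignored
        if key not in seen:
            seen.add(key)
            yield sequence


# Hypothetical sample input: [string, index] pairs
lista = [["a", 0], ["b", 1], ["a", 2], ["c", 3], ["b", 4]]
nao_repetidos = list(remove_duplicates_with_set(lista))
print(nao_repetidos)  # [['a', 0], ['b', 1], ['c', 3]]
```

The first occurrence of each string is kept along with its original index, and the input does not need to be sorted.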
A solution without generators, assuming the input list is already sorted by string, could be:
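    def remove_duplicates(sequences):
        last_string = None  # string of the last item added to result
        result = []
        for sequence in sequences:
            # Equal strings are adjacent, so comparing with the last one is enough
            if sequence[0] != last_string:
                result.append(sequence)
                last_string = sequence[0]
        return result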
This solution is particularly interesting because it is O(n): it goes through the list only once and does not rely on searching intermediate lists.
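If the input is not sorted yet, one option is to sort it by string first and then keep the first item of each group. This is only a sketch under my own assumptions (the name remove_duplicates_unsorted and the sample data are hypothetical). It relies on Python's sort being stable, so within each group the earliest occurrence stays first. Note that the sort makes it O(n log n) and the result comes out ordered by string rather than by original position:

```python
from itertools import groupby
from operator import itemgetter


def remove_duplicates_unsorted(sequences):
    by_string = itemgetter(0)
    # Stable sort: items with the same string keep their original relative order
    ordered = sorted(sequences, key=by_string)
    # groupby only merges adjacent equal keys, which the sort guarantees
    return [next(group) for _, group in groupby(ordered, key=by_string)]


# Hypothetical sample input: [string, index] pairs
lista = [["b", 0], ["a", 1], ["b", 2], ["a", 3]]
print(remove_duplicates_unsorted(lista))  # [['a', 1], ['b', 0]]
```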
Does the index of the first occurrence of the value need to be preserved?
– Woss
That question is answered here.
– TryAgain
Hi Anderson. Yes, I need to preserve the indexes!
– Pingam
@TryAgain, the function from the cited question only works for simple lists. When I pass in my list, it returns the same thing because of the indexes. It compares the whole entry, sees that it is a different value, and does not remove anything, even though the string is the same. :/
– Pingam