Function to remove duplicates from a Python list

Question

Asked 6 years ago

Guys I have this list:

lista = [['10616558', 0],
 ['2856466', 1],
 ['9715350', 2],
 ['9715350', 3],
 ['9715350', 4],
 ['10720706', 5]]

The first element is an arbitrary string, and the second is an index. I need to write a function that removes the elements whose string has already appeared in the list, keeping the first occurrence together with its index.

The output should look like this:

>>> lista = removeigual(lista)
>>> lista
[['10616558', 0],
 ['2856466', 1],
 ['9715350', 2],
 ['10720706', 5]]

I have a function that removes duplicates, but it only works for simple lists, and I could not adapt it to my problem:

def removeDuplicates(listofElements):

    uniqueList = []

    for elem in listofElements:
        if elem not in uniqueList:
            uniqueList.append(elem)
    return uniqueList

Must the index of the first occurrence of the value be preserved?

– Woss

2019/08/01 at 19:24
That question is answered here.

– TryAgain

2019/08/01 at 19:27
Hi Anderson. Yes I need to preserve the indexes!

– Pingam

2019/08/01 at 19:35
@TryAgain, the function in the linked question works only for simple lists. When I pass in my list, it returns the same thing because of the indexes: it compares each entry as a whole, sees a different value, and does not remove anything, even though the string is the same. :/

– Pingam

2019/08/01 at 19:41

3 answers

Your code is almost adapted to your problem; only a small change is needed on the line if elem not in uniqueList:.

The entire initial list is traversed, and when a value is found with a string that has already been used, it is not added to the final list:

def removeDuplicates(listofElements):
  uniqueList = []
  for elem in listofElements:
    if elem[0] not in [i[0] for i in uniqueList]:  # if the string is not yet in uniqueList
      uniqueList.append(elem)
  return uniqueList

lista = [['10616558', 0],
         ['2856466', 1],
         ['9715350', 2],
         ['9715350', 3],
         ['9715350', 4],
         ['10720706', 5]]

print(removeDuplicates(lista))

Output:

[['10616558', 0], ['2856466', 1], ['9715350', 2], ['10720706', 5]]

Thanks t3m2! Your solution worked too, thanks a lot! :D

– Pingam

2019/08/01 at 22:24

by Filipe Gonçalves • **236** points · Answer 1 · 2019-08-01T20:30:23+00:00

The following solution works if the index is a fixed value and not a number from some function being executed within the list.

lista = [['banana',0], ['caju',1], ['banana',2]]

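for fruta in list(lista):  # iterate over a copy, since lista is modified below
    if fruta not in lista:  # already removed as a duplicate of an earlier item
        continue
    for checkduplicada in list(lista):
        # same string but a different index: drop the later occurrence
        if fruta[0] == checkduplicada[0] and fruta[1] != checkduplicada[1]:
            lista.remove(checkduplicada)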

print(lista)

output = [['banana', 0], ['caju', 1]]

See if it is possible to apply to your case.

by Woss • **73,416** points · Answer 2 · 2019-08-01T20:41:40+00:00

If you can use generators (and I don’t see why you couldn’t), then instead of storing the entire list of string-and-index pairs to track what has already been returned, you can store only the string, which is your reference. The logic is basically this: for each item in your input list, check whether its string has already been returned; if not, yield the pair with string and index; if it has, just skip it.

def remove_duplicates(sequences):
    returned = []
    for sequence in sequences:
        if sequence[0] not in returned:
            yield sequence
            returned.append(sequence[0])

So just do:

nao_repetidos = list(remove_duplicates(lista))

Since the function returns a generator, just wrap the call in list() to get a list.

Some optimizations are possible. One is to make returned a set instead of a list, which speeds up the membership check. Another applies if the input list is already sorted by string: you do not need to keep every returned string in memory, you only need to check whether the current string in the loop is equal to the last string returned by the function.
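The set-based optimization could be sketched as follows. The names remove_duplicates_set and seen are made up for this illustration, and the sample list is the one from the question; treat it as a sketch, not the only way to do it.

```python
def remove_duplicates_set(sequences):
    """Yield each [string, index] pair whose string has not been seen before."""
    seen = set()  # strings already yielded; set lookups are O(1) on average
    for sequence in sequences:
        if sequence[0] not in seen:
            seen.add(sequence[0])
            yield sequence


lista = [['10616558', 0],
         ['2856466', 1],
         ['9715350', 2],
         ['9715350', 3],
         ['9715350', 4],
         ['10720706', 5]]

nao_repetidos = list(remove_duplicates_set(lista))
print(nao_repetidos)
# [['10616558', 0], ['2856466', 1], ['9715350', 2], ['10720706', 5]]
```

This version does not require the input to be sorted, at the cost of keeping every distinct string in memory.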

A solution without generators, assuming the input list is sorted by string (so that equal strings are adjacent), could be:

def remove_duplicates(sequences):
    last_string = None
    result = []
    for sequence in sequences:
        if sequence[0] != last_string:
            result.append(sequence)
            last_string = sequence[0]
    return result

This solution is particularly interesting because it is O(n): it goes through the list only once and does not rely on searching intermediate lists.
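The same "compare with the previous string" idea can also be sketched with itertools.groupby from the standard library, which groups consecutive items that share a key. This is an alternative technique, not the code from the answer above; the function name remove_adjacent_duplicates is hypothetical, and it carries the same assumption that equal strings are next to each other.

```python
from itertools import groupby
from operator import itemgetter


def remove_adjacent_duplicates(sequences):
    """Keep the first pair of each run of consecutive equal strings."""
    # groupby only groups *consecutive* items with the same key,
    # so equal strings must already be adjacent in the input.
    return [next(group) for _, group in groupby(sequences, key=itemgetter(0))]


lista = [['10616558', 0],
         ['2856466', 1],
         ['9715350', 2],
         ['9715350', 3],
         ['9715350', 4],
         ['10720706', 5]]

print(remove_adjacent_duplicates(lista))
# [['10616558', 0], ['2856466', 1], ['9715350', 2], ['10720706', 5]]

# If equal strings are not adjacent, the later occurrence is NOT removed:
print(remove_adjacent_duplicates([['a', 0], ['b', 1], ['a', 2]]))
# [['a', 0], ['b', 1], ['a', 2]]
```

The second call shows why the adjacency assumption matters: on input that is not grouped by string, a set-based check is the safer choice.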