How to count repeated elements in a list of tuples?

Asked

Viewed 586 times

-1

listaAnimais=[('leao', 'Simba'), ('javali', 'Pumba'), ('leao', 'Scar'), ('hiena', 'Banzai'), ('leao', 'Mufasa')] #especie,nome
dicEspecies={}

for especies,nome in listaAnimais:
    dicEspecies[(especies)]=(listaAnimais.count(especies))

print(dicEspecies)

Result obtained:

{'leao': 0, 'hiena': 0, 'javali': 0}

Desired result:

{'leao': 3, 'hiena': 1, 'javali': 1}
  • Welcome to Sopt. I edited your question so that it meets the standards of the site. To improve your future questions, please read this manual: https://pt.meta.stackoverflow.com/questions/5483/manual-de-como-n%C3%83o-ask-questions? cb=1

  • If one of the answers below solved your problem and there was no doubt left, choose the one you liked the most and mark it as correct/accepted by clicking on the " " that is next to it, which also marks your question as solved. If you still have any questions or would like further clarification, feel free to comment.

2 answers

2

You don’t need to use the entire list within the for loop since you only need the first element of the tuples. A solution would be to create a sub-list from the first and only then iterate. See:

listaAnimais=[('leao', 'Simba'), ('javali', 'Pumba'), ('leao', 'Scar'), ('hiena', 'Banzai'), ('leao', 'Mufasa')] #especie,nome
lista_especies = [t[0] for t in listaAnimais]
dicEspecies={}

for especie in lista_especies:
    dicEspecies[(especie)]= lista_especies.count(especie)

print(dicEspecies)

Output:

{'leao': 3, 'javali': 1, 'hiena': 1}

1

The easiest way is to use one Counter, standard library class collections.

It acts as a dictionary and can be initialized with a list, or in this case a generating expression:

from collections import Counter

# especie,nome
listaAnimais = [
    ('leao', 'Simba'),
    ('javali', 'Pumba'),
    ('leao', 'Scar'),
    ('hiena', 'Banzai'),
    ('leao', 'Mufasa'),
]

# Criar um contador, o inicializando com a primeira string
# de cada elemento em `listaAnimais`
dicEspecies = Counter(animal[0] for animal in listaAnimais)

print(dicEspecies)
# Counter({'leao': 3, 'javali': 1, 'hiena': 1})

Behind the scenes, this solution is efficient because it runs in time linearly proportional to the size of the list, and not quadratically how to use count within a loop for.

In the simplest way possible, she does the equivalent of this:

# Inicializamos um dicionário vazio
dicEspecies = {}

# Para cada animal...
for especie, nome in listaAnimais:

    # Se a espécie ainda não estiver no nosso
    # dicionário, a inicializamos com o valor de 0
    if especie not in dicEspecies:
        dicEspecies[especie] = 0

    # Em qualquer caso, somamos um à contagem
    # dessa espécie
    dicEspecies[especie] += 1

print(dicEspecies)
# {'leao': 3, 'javali': 1, 'hiena': 1}

About performance: it is a trivial observation for a list of this size, but if the list were large, we would have to be careful with the complexity of the algorithm used for counting. That answer has interesting content on the complexity of algorithms and how to analyze it. That answer is also very good.

Basically, the approach using count inside the loop for works in quadratic time O(N 2): that is, as the size of the list N increases, the execution time of the function increases proportionally to N 2.

This is because

  • The algorithm iterates under all elements of the list with the for
  • For every element, he calls count: this function, internally, again traverses the entire list to count how many elements are present.

Already using a dictionary or Counter, we only go through the list once on for. The two, internally, are a data structure called a hashmap (hashmap), which has the property of having constant-time access and belonging verification; that is, it is independent of the list size N.

In this case, as the list size N grows, the function runtime increases proportionally to N, because we only go through the list once, not N times.

  • Very interesting your answer. How do I know these performance details? Using your example, how to know if a function runs linear or quadratically?

  • 1

    @Hartnäckig is basically a study of algorithm complexity - nothing important for a list that size, but it’s worth keeping in mind if you’re dealing with much larger lists. I updated the answer to explain a little better!

Browser other questions tagged

You are not signed in. Login or sign up in order to post.