To count the occurrences of code_client in your records using Python, you can use:
- the class collections.Counter
- a plain dictionary
- the class collections.defaultdict
- ...some other solution I don't know of...
For the following examples, I will use a sequence of tuples to simulate the result of a SELECT on the database. The fictitious data are:
dados = (
    ('BRA', 'BRASIL_ALIMEN', 'SAO_PAULO'),
    ('BRA', 'BRASIL_CARROS', 'PARANA'),
    ('BRA', 'BRASIL_NAVIOS', 'PARAIBA'),
    ('CAN', 'CANADA_ALIMEN', 'ALBERTA'),
    ('USA', 'USA_CARROS', 'MASSACHUSSETS'),
    ('USA', 'USA_NAVIOS', 'CALIFORNIA'),
    ('UK', 'UK_NAVIOS', 'YORK'),
)
In the following examples, I will count the occurrences of the first element of each tuple, such as BRA, CAN, etc.
Necessary knowledge
In the examples in this answer I use list comprehensions* and iterable unpacking (see PEP 3132 for more information).
For the sake of clarity, here is a brief demonstration of how I use them below:
# create a regular list
lista = [1, 2, 3, 4, 5]
# use a list comprehension to create another list
list_comprehension = [-x for x in lista]
# list_comprehension = [-1, -2, -3, -4, -5]
# use iterable unpacking to "break the list into pieces"
um, dois, *restante = lista
# um = 1
# dois = 2
# restante = [3, 4, 5]  (the starred name always receives a list)
With this information, I believe the code that follows will pose no problem.
* Strictly speaking, the examples use generator expressions, but generator expressions, dict comprehensions and their variations are much easier to understand once you understand list comprehensions.
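To make the footnote concrete, here is a minimal sketch of the difference between a list comprehension and a generator expression. The names como_lista, como_gerador and total are only illustrative.

```python
lista = [1, 2, 3, 4, 5]

# List comprehension: builds the whole list in memory right away.
como_lista = [-x for x in lista]

# Generator expression: same syntax with parentheses, but the values
# are produced lazily, one at a time, as something iterates over it.
como_gerador = (-x for x in lista)

# When the generator expression is the only argument of a call,
# the extra parentheses can be dropped.
total = sum(-x for x in lista)

print(como_lista)          # [-1, -2, -3, -4, -5]
print(list(como_gerador))  # [-1, -2, -3, -4, -5]
print(total)               # -15
```

A generator can only be consumed once, which is fine here because each counter reads the data a single time.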
1. Using collections.Counter
(#Docs)
The class Counter is a subclass of dict, the standard Python dictionary, that serves as a counter for hashable objects.
We can create a counter from any iterable, such as a list, tuple or, for example, a string:
from collections import Counter
contador = Counter("abracadabra")
# Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
This way we can make Counter count only the first element of each tuple with the following code:
from collections import Counter
contador = Counter(cod_cliente for cod_cliente, *_ in dados)
# contador = Counter({'BRA': 3, 'USA': 2, 'CAN': 1, 'UK': 1})
Recall that:
[cod_cliente for cod_cliente, *_ in dados]
# ['BRA', 'BRA', 'BRA', 'CAN', 'USA', 'USA', 'UK']
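Once the counter is built, it can be read like any dictionary. The sketch below repeats the sample data so it runs on its own and shows two conveniences of Counter: missing keys count as zero, and most_common() returns the entries ordered by count. The key 'JPN' is just a made-up value that is not in the data.

```python
from collections import Counter

dados = (
    ('BRA', 'BRASIL_ALIMEN', 'SAO_PAULO'),
    ('BRA', 'BRASIL_CARROS', 'PARANA'),
    ('BRA', 'BRASIL_NAVIOS', 'PARAIBA'),
    ('CAN', 'CANADA_ALIMEN', 'ALBERTA'),
    ('USA', 'USA_CARROS', 'MASSACHUSSETS'),
    ('USA', 'USA_NAVIOS', 'CALIFORNIA'),
    ('UK', 'UK_NAVIOS', 'YORK'),
)

contador = Counter(cod_cliente for cod_cliente, *_ in dados)

print(contador['BRA'])          # 3
print(contador['JPN'])          # 0 (no KeyError, and the key is not created)
print(contador.most_common(2))  # [('BRA', 3), ('USA', 2)]
```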
2. Using dict
(#Docs)
We can use a plain dictionary and add up the occurrences as we iterate over dados.
The only case we have to handle is a key that does not yet exist in the dictionary: if we try to read a non-existent key, dict.__getitem__ raises a KeyError (it only calls a __missing__ method when a subclass defines one, and a plain dict does not). Examples:
dicionario = {'teste': 10}
# 1) Update an existing key (OK)
dicionario['teste'] = dicionario['teste'] + 1
# dicionario = {'teste': 11}
# 2) Update a non-existent key (error: this line raises)
dicionario["outra-chave"] = dicionario["outra-chave"] + 1
#                           ^^^^^^^^^^^^^^^^^^^^^^^^^
# KeyError: 'outra-chave' does not exist in 'dicionario'
# 3) Test before using the key (OK)
if 'outra-chave' not in dicionario:
    dicionario["outra-chave"] = 0
dicionario["outra-chave"] += 1
# dicionario = {'teste': 11, 'outra-chave': 1}
You could also catch the KeyError exception to handle these cases. Example:
dicionario = {}
try:
    dicionario['teste'] += 1
except KeyError:
    dicionario['teste'] = 1
However, dictionaries have the method get, which takes the arguments dict.get(key, default), where key is the key you want to read and default is the value returned if that key does not exist.
In our case, we want to add 1 to the current value of the key, and if the key does not exist we want its current value to be treated as 0. In practice:
dicionario = {}
dicionario['teste'] += 1
# KeyError
dicionario['teste'] = dicionario['teste'] + 1
# KeyError
dicionario['teste'] = dicionario.get('teste', 0) + 1
# dicionario = {'teste': 1}
Thus, if the key does not yet exist, it creates the new key with the appropriate value.
The final code would be:
contador = {}
for code_client, *_ in dados:
    contador[code_client] = contador.get(code_client, 0) + 1
# contador = {'BRA': 3, 'CAN': 1, 'USA': 2, 'UK': 1}
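If you need the same count for other columns, the loop can be wrapped in a small helper. This is only a sketch: the function name contar_por_coluna and its indice parameter are hypothetical, not something from the question.

```python
def contar_por_coluna(registros, indice=0):
    """Count how often each value appears at position `indice` of the records."""
    contador = {}
    for registro in registros:
        chave = registro[indice]
        # get() returns 0 the first time a key is seen
        contador[chave] = contador.get(chave, 0) + 1
    return contador


dados = (
    ('BRA', 'BRASIL_ALIMEN', 'SAO_PAULO'),
    ('BRA', 'BRASIL_CARROS', 'PARANA'),
    ('BRA', 'BRASIL_NAVIOS', 'PARAIBA'),
    ('CAN', 'CANADA_ALIMEN', 'ALBERTA'),
    ('USA', 'USA_CARROS', 'MASSACHUSSETS'),
    ('USA', 'USA_NAVIOS', 'CALIFORNIA'),
    ('UK', 'UK_NAVIOS', 'YORK'),
)

por_codigo = contar_por_coluna(dados)
por_estado = contar_por_coluna(dados, indice=2)

print(por_codigo)  # {'BRA': 3, 'CAN': 1, 'USA': 2, 'UK': 1}
```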
3. Using defaultdict
(#Docs)
Just like the collections.Counter mentioned earlier, defaultdict is also a subclass of dict.
The class defaultdict has an attribute default_factory, which must be a callable object or None.
With a plain dict, accessing a non-existent key makes dict.__getitem__ raise a KeyError. For subclasses of dict that define a __missing__ method, dict.__getitem__ calls that method instead of raising.
defaultdict defines __missing__ so that it invokes defaultdict.default_factory, stores the return value under the missing key, and returns it as the default value.
A summary:
dicionario = {}
valor = dicionario['teste']
# 1. calls dicionario.__getitem__('teste')
# 2. the key does not exist, and a plain dict defines no __missing__
# 3. dict.__getitem__ raises a KeyError
# KeyError
Now with defaultdict:
from collections import defaultdict
# function that will be the 'default_factory' of the defaultdict
def valor_padrao():
    return "Valor padrão"
dicionario = defaultdict(valor_padrao)
valor = dicionario['teste']
# 1. calls dicionario.__getitem__('teste')
# 2. the key does not exist, so it calls dicionario.__missing__('teste')
# 3. defaultdict.default_factory is callable, so __missing__ calls it
# 4. dicionario['teste'] = dicionario.default_factory()
# 5. valor = dicionario['teste']
# valor = 'Valor padrão'
If defaultdict.default_factory is None, defaultdict behaves the same way as dict and raises a KeyError for non-existent keys.
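A quick way to see both behaviours side by side is the sketch below. The variable names sem_fabrica, com_fabrica and resultado are only illustrative.

```python
from collections import defaultdict

# default_factory is None: behaves like a plain dict on missing keys
sem_fabrica = defaultdict(None)
try:
    sem_fabrica['teste']
    resultado = 'no error'
except KeyError:
    resultado = 'KeyError'

# default_factory is int: a missing key is created with int() == 0
com_fabrica = defaultdict(int)
valor = com_fabrica['teste']

print(resultado)          # KeyError
print(valor)              # 0
print(dict(com_fabrica))  # {'teste': 0}
```

Note that merely reading a missing key from a defaultdict with a factory inserts that key, which is exactly what makes the += 1 idiom work.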
For our final code, we just need a function that returns zero to use as default_factory. Conveniently, the function int returns zero when invoked without arguments. So the final code using defaultdict would be:
from collections import defaultdict
contador = defaultdict(int)
for code_client, *_ in dados:
    contador[code_client] += 1
# contador = defaultdict(<class 'int'>, {'BRA': 3, 'CAN': 1, 'USA': 2, 'UK': 1})
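As a sanity check, the three approaches can be run against the same data and compared. This is a minimal sketch; the names com_counter, com_dict and com_defaultdict are only illustrative.

```python
from collections import Counter, defaultdict

dados = (
    ('BRA', 'BRASIL_ALIMEN', 'SAO_PAULO'),
    ('BRA', 'BRASIL_CARROS', 'PARANA'),
    ('BRA', 'BRASIL_NAVIOS', 'PARAIBA'),
    ('CAN', 'CANADA_ALIMEN', 'ALBERTA'),
    ('USA', 'USA_CARROS', 'MASSACHUSSETS'),
    ('USA', 'USA_NAVIOS', 'CALIFORNIA'),
    ('UK', 'UK_NAVIOS', 'YORK'),
)

# 1. collections.Counter
com_counter = Counter(code_client for code_client, *_ in dados)

# 2. plain dict with get()
com_dict = {}
for code_client, *_ in dados:
    com_dict[code_client] = com_dict.get(code_client, 0) + 1

# 3. collections.defaultdict
com_defaultdict = defaultdict(int)
for code_client, *_ in dados:
    com_defaultdict[code_client] += 1

# Converted to plain dicts, all three hold the same counts.
iguais = dict(com_counter) == com_dict == dict(com_defaultdict)
print(iguais)  # True
```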
These are 3 ways to count the repetitions in the data you receive from your database. But, as a reminder for future visitors: using GROUP BY and COUNT in the query itself performs much better.
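For those future visitors, here is a rough sketch of what counting in the query looks like. The asker's database is DB2; this example uses an in-memory SQLite database from the standard library purely as a stand-in, and the table name clientes and its column names are assumptions made up for the illustration.

```python
import sqlite3

dados = (
    ('BRA', 'BRASIL_ALIMEN', 'SAO_PAULO'),
    ('BRA', 'BRASIL_CARROS', 'PARANA'),
    ('BRA', 'BRASIL_NAVIOS', 'PARAIBA'),
    ('CAN', 'CANADA_ALIMEN', 'ALBERTA'),
    ('USA', 'USA_CARROS', 'MASSACHUSSETS'),
    ('USA', 'USA_NAVIOS', 'CALIFORNIA'),
    ('UK', 'UK_NAVIOS', 'YORK'),
)

# Hypothetical table that mimics the shape of the sample data.
conexao = sqlite3.connect(':memory:')
conexao.execute(
    'CREATE TABLE clientes (cod_cliente TEXT, nome TEXT, estado TEXT)'
)
conexao.executemany('INSERT INTO clientes VALUES (?, ?, ?)', dados)

# The database does the counting; Python only receives one row per code.
consulta = '''
    SELECT cod_cliente, COUNT(*) AS total
    FROM clientes
    GROUP BY cod_cliente
    ORDER BY cod_cliente
'''
contagem = dict(conexao.execute(consulta).fetchall())
conexao.close()

print(contagem)  # {'BRA': 3, 'CAN': 1, 'UK': 1, 'USA': 2}
```

The advantage is that only the aggregated rows travel from the database to the script, instead of every record.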
I created this Repl.it with the 3 examples running, in case anyone is interested.
Did you try using a count or an if to count? – Edward Ramos
That's the only way I could do it: print(len(lds_data.keys()))
– Luis Henrique
Take a look at this documentation: https://docs.python.org/2/library/collections.html#collections.Counter
– Vinicius Bussola
for item in lds_data: clients = Counter(lds_data[item]['code_client']) print(clients) I already tested this, but to no avail...
– Luis Henrique
If you understand English, take a look at this question: https://stackoverflow.com/questions/17705829/count-repeated-keys-in-a-dict
– Vinicius Bussola
I’ll take a look, thanks!
– Luis Henrique
Luis, judging by your last questions here on the site, I think you need to study SQL. You can get the repetition count directly from the database, which performs much better. Unless you're learning Python and deliberately don't want to use SQL.
– fernandosavio
Unfortunately I can't use SQL, I have to do it in Python in the back-end... It would be much faster to do it straight in the database, but unfortunately I can't. Thanks for the tip, though!
– Luis Henrique
But where does db_result come from? – fernandosavio
A DB2 database
– Luis Henrique
That's not what I meant... db_result is the result of a query made by Python against a database... Can't you modify this query? – fernandosavio
Ah ok, sorry, I hadn't understood. No, I can't run a different query from the one I was given, I can only work on the script
– Luis Henrique
@Luisv. I answered the question with 3 ways to count occurrences in an iterable. If you have any questions, just ask.
– fernandosavio