Select lines by first digits, and average them per year

Question

Asked 6 years, 9 months ago

I have a CSV of the following type:

code   year   sales
2011   1970   5000
2011   1971   5200
2011   1972   ...
...   
2015   1970
2015   1971
2015   1972
...
3025
...
3026
...
3052
...

How can I select all rows whose code starts with '20' or '30', and average the sales for each year (the year column)?

Thank you very much!!

1 answer

Tagged: python, pandas

by nosklo • **5,801** points · Answer 1 · 2018-11-10T16:05:16+00:00

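import collections
import csv

# Running total and row count of sales for each (code, year) pair.
soma = collections.defaultdict(float)
qtd = collections.defaultdict(int)

# Assumes a tab-separated file; change the delimiter if yours differs.
with open('arquivo.csv', newline='') as f:
    cf = csv.DictReader(f, delimiter='\t')
    for reg in cf:
        # Keep only rows whose code starts with '20' or '30'.
        if not reg['code'].startswith(('20', '30')):
            continue
        chave = (reg['code'], reg['year'])
        soma[chave] += float(reg['sales'])
        qtd[chave] += 1

for code, year in sorted(soma):
    print("The average for code {} in year {} is: {}".format(
        code, year, soma[code, year] / qtd[code, year]
    ))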
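
For comparison, the same idea can be sketched with pandas, which the question is tagged with. This is only a sketch under assumptions: the sample rows are made up for illustration, the input is assumed to be tab-separated, `mean_sales_per_year` is a hypothetical helper name, the prefixes are assumed to be two characters long, and "average per year" is read as one mean per year across all selected codes.

```python
import io

import pandas as pd

# Made-up sample rows for illustration only; not the asker's real data.
SAMPLE = (
    "code\tyear\tsales\n"
    "2011\t1970\t5000\n"
    "2011\t1971\t5200\n"
    "2015\t1970\t7000\n"
    "2015\t1971\t6800\n"
    "3025\t1970\t3000\n"
    "3025\t1971\t3300\n"
    "4100\t1970\t9999\n"
)


def mean_sales_per_year(source, prefixes=("20", "30"), sep="\t"):
    """Return the mean of sales per year for codes starting with a given prefix.

    Hypothetical helper: assumes every prefix is exactly two characters long
    and that the columns are named code, year and sales.
    """
    # Read code as text so its leading digits can be compared as strings.
    df = pd.read_csv(source, sep=sep, dtype={"code": str})
    # Keep rows whose first two characters match one of the prefixes.
    mask = df["code"].str[:2].isin(prefixes)
    # One mean per year, taken over all selected codes.
    return df.loc[mask].groupby("year")["sales"].mean()


# A file path such as 'arquivo.csv' can be passed instead of the StringIO.
result = mean_sales_per_year(io.StringIO(SAMPLE))
print(result)
```

To get a separate average for each code and year instead, group by `["code", "year"]` rather than by `"year"` alone.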