Select lines by first digits, and average them per year

I have a CSV of the following type:

code   year   sales
2011   1970   5000
2011   1971   5200
2011   1972   ...
...   
2015   1970
2015   1971
2015   1972
...
3025
...
3026
...
3052
...

How can I select all rows whose code starts with '20' or '30' and average the sales for each year?

Thank you very much!!

1 answer


import csv, collections

# Running sum and row count of sales, keyed by (code, year)
soma = collections.defaultdict(float)
qtd = collections.defaultdict(int)

with open('arquivo.csv', newline='') as f:
    # Assumes a tab-separated file with a header row: code, year, sales
    cf = csv.DictReader(f, delimiter='\t')
    for reg in cf:
        chave = (reg['code'], reg['year'])
        soma[chave] += float(reg['sales'])
        qtd[chave] += 1

# Print the average for each (code, year) pair
for code, year in sorted(soma):
    print("The average for code {} in year {} is: {}".format(
        code, year, soma[code, year] / qtd[code, year]
    ))
  • I was not very clear, sorry. I want the average over all codes starting with '20' (i.e. 2011, 2015, ...) for each year: one average for 1970 across those codes, and so on for all years. Thanks for the help!

  • @R.Moura if your question is unclear, edit it and add more information: sample input and an example of the output you want. It would help those answering if you included about 10 lines of input and showed what the result would look like if your file contained only those 10 lines.
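
Going by the clarification in the comments, here is a minimal sketch of grouping by code prefix and year instead of by individual code. It rests on several assumptions: the file is tab-separated with a code, year, sales header; each prefix ('20', '30') gets its own average per year; and the function name average_sales_by_year and the sample rows are made up for illustration, since the real data is elided in the question.

```python
import collections
import csv
import io


def average_sales_by_year(f, prefixes=("20", "30"), delimiter="\t"):
    """Return {(prefix, year): mean sales} over rows whose code starts with a prefix.

    Hypothetical helper: assumes a header row with 'code', 'year', 'sales'.
    """
    soma = collections.defaultdict(float)
    qtd = collections.defaultdict(int)
    for reg in csv.DictReader(f, delimiter=delimiter):
        for prefix in prefixes:
            if reg["code"].startswith(prefix):
                chave = (prefix, reg["year"])
                soma[chave] += float(reg["sales"])
                qtd[chave] += 1
                break  # a row counts toward at most one prefix
    return {chave: soma[chave] / qtd[chave] for chave in soma}


# Made-up sample data, only to show the expected shape of the input.
sample = io.StringIO(
    "code\tyear\tsales\n"
    "2011\t1970\t5000\n"
    "2015\t1970\t7000\n"
    "2011\t1971\t5200\n"
    "3025\t1970\t100\n"
    "4000\t1970\t999\n"
)
medias = average_sales_by_year(sample)
for (prefix, year), media in sorted(medias.items()):
    print("Average for codes starting with {} in year {}: {}".format(
        prefix, year, media
    ))
```

To run it on the real file, pass an open file object instead of the StringIO, e.g. open('arquivo.csv', newline=''). If you want a single average over both prefixes together, key the dictionaries by year alone.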
