Quantity in a dataset

I have this dataset: [image]

I would like to count the occurrences of 'DATA_CRIA' for each year listed in this other dataset: [image]

I tried the following code, but it did not work:

lista = []
for ano in anos:
    lista.append(oni['DATA_CRIA'].count())
print(lista)

  • Please take the [tour] and read the [Ask] guide. It is not exactly clear what you want to do. Also, try to avoid posting code as an image: many users cannot see images, and it worsens the experience in the mobile app and harms site search.

1 answer


I am assuming you are using pandas and a DataFrame.

Starting from that:

import pandas as pd

anos = [2008, 2009, 2010]

d = {'DISTANCIA': [12.3, 33.3, 11.1, 43.4], 'DATA_CRIA': [2008, 2008, 2009, 1909]}
df = pd.DataFrame(data=d)
>>> print(df)
   DISTANCIA  DATA_CRIA
0       12.3       2008
1       33.3       2008
2       11.1       2009
3       43.4       1909

We can create a function that counts the occurrences of a given year in a DataFrame:

def conta_ocorrencias(df, ano):
    ocorrencias = 0
    for i in range(len(df)):
        if df.iloc[i]['DATA_CRIA'] == ano:
            ocorrencias += 1
    return ocorrencias

And then we do:

lista = []
for ano in anos:
    lista.append(conta_ocorrencias(df, ano))
>>> print(lista)
[2, 1, 0]
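The same per-year count can also be written without the explicit row loop. This is only a minimal sketch, assuming the same sample `df` and `anos` as above; it still makes one pass over the column per year.

```python
import pandas as pd

anos = [2008, 2009, 2010]
df = pd.DataFrame({'DISTANCIA': [12.3, 33.3, 11.1, 43.4],
                   'DATA_CRIA': [2008, 2008, 2009, 1909]})

# Comparing the column to a year gives a boolean Series;
# summing it counts how many rows are True.
lista = [int((df['DATA_CRIA'] == ano).sum()) for ano in anos]
print(lista)  # [2, 1, 0]
```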

However, this is not very efficient, because it scans the whole DataFrame once for every year, so we can change the approach.

# create a dictionary with the years as keys to count the occurrences
dic_ano = {}
for ano in anos:
    dic_ano.update({ano: 0})
>>> print(dic_ano)
{2008: 0, 2009: 0, 2010: 0}

Then I walk through the DataFrame just once, like this:

for i in range(len(df)):
    for ano in anos:
        # increment the counter of the year that matches this row
        if df.iloc[i]['DATA_CRIA'] == ano:
            dic_ano[ano] += 1
>>> print(dic_ano)
{2008: 2, 2009: 1, 2010: 0}
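As an alternative to the manual loop, pandas can do the single-pass count itself. The sketch below is one possible way, assuming the same sample data; the variable name `contagem` is only illustrative. `value_counts` counts every distinct year, and `reindex` keeps only the years in `anos`, filling missing ones with 0.

```python
import pandas as pd

anos = [2008, 2009, 2010]
df = pd.DataFrame({'DISTANCIA': [12.3, 33.3, 11.1, 43.4],
                   'DATA_CRIA': [2008, 2008, 2009, 1909]})

# Count each distinct year once, then align the result to the years we care about.
contagem = df['DATA_CRIA'].value_counts().reindex(anos, fill_value=0)

lista = contagem.tolist()          # counts in the same order as anos
dic_ano = dict(zip(anos, lista))   # same shape as the dictionary built by hand
print(dic_ano)  # {2008: 2, 2009: 1, 2010: 0}
```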
  • Thanks for the tip, but the problem is that I need each year's result plus the sum of the previous years, and I want the result in a list.

  • I made an example manually:
    listas = []
    oni_8 = gd.loc[gd['DATA_CRIA'] <= 2008]
    depois8 = oni_8['FROTA'].sum()
    listas.append(depois8)
    What I want is to scan the dataset like this, to create a list with the totals for each year.

  • The 'FROTA' column contains only integers.
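
The manual example from the comments can be generalized to every year in `anos`. This is only a sketch under assumptions: the real `gd` is not shown, so the sample values below are hypothetical, and only the column names 'DATA_CRIA' and 'FROTA' come from the comments.

```python
import pandas as pd

# Hypothetical sample data; the real gd and its values are not shown in the question.
gd = pd.DataFrame({'DATA_CRIA': [2008, 2008, 2009, 1909],
                   'FROTA': [10, 5, 7, 3]})
anos = [2008, 2009, 2010]

# For each year, sum FROTA over every row created in that year or earlier.
listas = [int(gd.loc[gd['DATA_CRIA'] <= ano, 'FROTA'].sum()) for ano in anos]
print(listas)  # [18, 25, 25]
```

Each entry is the running total up to and including that year, so a year with no new rows (2010 here) simply repeats the previous total.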
