Value_counts - Pandas - Dataframe - Zero Quantity


I have a dataframe with one column of hours and another with the type of lightning strike. I'm using value_counts to count the number of lightning strikes in each hour of the day. value_counts works, but the hours that had no lightning strikes do not appear in the resulting series. What I want is a series covering all 24 hours with the count for each one. Instead, the hours without any strikes are skipped. I need every hour to show up, with a zero for the hours that had none. I tried np.arange, sort_index(level=), and .fillna(0), but none of them worked. My cell looks like this:

quantidade_IC = IC["hor"].value_counts(ascending=True)
quantidade_IC.sort_index()
display(quantidade_IC.sort_index())

And this is the result (hours 11 and 12 do not appear with a zero next to them :/)

0       510
1       772
2       275
3        50
4        12
5        16
6        41
7       319
8       201
9        25
10       10
13       29
14      138
15      799
16     3619
17     9622
18    10935
19    13851
20    10928
21     6227
22     1500
23      594
  • 2

    Make a dictionary with every hour of the day as a key and 0 as the value for each one. Then update it with the result you already have, which holds the number of lightning strikes for each hour. Then just print it.

  • 1

    Can you provide an example dataset?

  • My dataframe has 72603 rows and 2 columns.

1 answer

1

Based on the above comment:

Importing libraries

import pandas as pd
import random

Auxiliary variables

tamanho = 1000
horas = [0,1,2,3,4,5,6,7,8,9,10,13,14,15,16,17,18,19,20,21,22,23] # without 11 and 12

Creating Test Dataframe

df = pd.DataFrame({"hora": [random.choice(horas) for _ in range(tamanho)], "relampago": [int(random.random()*10) for _ in range(tamanho)]})

Creating auxiliary dictionary

d = dict(zip(range(24), [0]*24))

print(d)

{0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0, 16: 0, 17: 0, 18: 0, 19: 0, 20: 0, 21: 0, 22: 0, 23: 0}

Updating auxiliary dictionary

d.update(df.groupby("hora")["hora"].count().to_dict())

print(d)

{0: 54, 1: 43, 2: 48, 3: 52, 4: 52, 5: 41, 6: 50, 7: 49, 8: 39, 9: 47, 10: 45, 11: 0, 12: 0, 13: 53, 14: 38, 15: 46, 16: 52, 17: 45, 18: 42, 19: 35, 20: 48, 21: 52, 22: 32, 23: 37}

Turning dictionary into dataframe

c = pd.DataFrame.from_dict(d, orient='index', columns=["qtd"])

print(c)

    qtd
0    54
1    43
2    48
3    52
4    52
5    41
6    50
7    49
8    39
9    47
10   45
11    0
12    0
13   53
14   38
15   46
16   52
17   45
18   42
19   35
20   48
21   52
22   32
23   37
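
As a shorter alternative to the dictionary approach above (not part of the original answer), the counts can probably be filled in directly with reindex. This is a minimal sketch: the sample data is made up for illustration, and only the names IC, "hor" and quantidade_IC come from the question.

```python
import pandas as pd

# Hypothetical sample data: hours 11 and 12 are deliberately absent.
IC = pd.DataFrame({"hor": [0, 0, 1, 10, 13, 13, 13, 23]})

# value_counts only lists hours that actually occur in the column.
# reindex(range(24)) adds the missing hours as index labels, and
# fill_value=0 gives them a count of zero instead of NaN.
quantidade_IC = IC["hor"].value_counts().reindex(range(24), fill_value=0)

print(quantidade_IC)
```

Because reindex follows the order of range(24), the result is already sorted by hour, so a separate sort_index() call should not be needed.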
  • Thank you very much!!! It helped a lot!
