I need to reproduce the mean, mode and median given for this table:
TVs/day Absolute freq.
0 |----- 20 5
20|----- 40 25
40|----- 60 40
60|----- 80 15
80|----- 100 10
100|----- 120 5
Expected: mean=53 mode=50 median=50
The idea is to compute the midpoint of each interval in the first column and then weight it by that interval's frequency. This is what I came up with:
import pandas as pd
from statistics import mean, mode, median
televisores = [*range(0, 120)]
frequencia = [5, 25, 40, 15, 10, 5]
df = pd.DataFrame({'televisores': televisores})
bins = pd.cut(df['televisores'], [0, 20, 40, 60, 80, 100, 120])
df = df.groupby(bins)['televisores'].agg(Media='mean')
df['Freq. absoluta'] = frequencia
count = [x for x,y in zip(df['Media'], df['Freq. absoluta']) for i in range(y)]
The problem is that the mean of each interval comes out 0.5 higher than expected:
df
Media Freq. absoluta
televisores
(0, 20] 10.5 5
(20, 40] 30.5 25
(40, 60] 50.5 40
(60, 80] 70.5 15
(80, 100] 90.5 10
(100, 120] 110.0 5
mean(count), mode(count), median(count)
53.475 50.5 50.5
I would like to understand what is going wrong and whether there is an easier way to get the result.
Did you try using include_lowest=True in the cut? Either way, I believe your result is correct, since you used cut. – Paulo Marques
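One likely explanation for the extra 0.5: by default pd.cut builds right-closed intervals, so (0, 20] holds the integers 1 to 20, whose mean is 10.5 rather than the class midpoint 10. The last bin shows 110.0 only because range(0, 120) stops at 119. A minimal sketch that skips the binning and takes the midpoints straight from the class limits is shown below; it uses only the standard library, and the names edges, freqs, midpoints and expanded are illustrative, not from the original code.

```python
from statistics import mean, median, mode

# Class limits and absolute frequencies copied from the table in the question.
edges = [0, 20, 40, 60, 80, 100, 120]
freqs = [5, 25, 40, 15, 10, 5]

# Where the 0.5 comes from: (0, 20] excludes 0 and includes 20,
# so averaging the integers in that bin means averaging 1..20.
print(mean(range(1, 21)))  # 10.5

# Midpoint of each class, computed from consecutive limits.
midpoints = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

# Repeat each midpoint according to its frequency, as in the original approach.
expanded = [m for m, f in zip(midpoints, freqs) for _ in range(f)]

print(mean(expanded), mode(expanded), median(expanded))  # 53.0 50.0 50.0
```

Note that this takes the mode and median as class midpoints, which matches the expected values here; interpolation formulas for grouped data (such as Czuber's mode) can give different numbers for other tables.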