1
Good afternoon,
I have a question: could someone explain how to build a frequency distribution table with classes, absolute and relative frequencies, cumulative frequencies, and the midpoint (average value) of each class?
2
Hi, I managed to create a... I'll share the example using Pandas; maybe it will help you!
Calculations needed to build the table: the class width (h) is given by h = AT / k, where AT = max(x) - min(x) is the total range of the data and k = sqrt(n) is an estimate of the number of classes for a data set with n observations (k can also be computed by other rules, such as Sturges' rule).
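To make the formulas concrete, here is a minimal sketch of the arithmetic on a small made-up sample. The sample `x` is hypothetical (it is not the data from the question), and the Sturges form used here, k = 1 + log2(n), is one common way of writing that rule.

```python
import math

# Hypothetical sample of n = 16 observations (not from the original post)
x = [4.6, 5.0, 5.2, 6.1, 6.3, 7.0, 7.4, 7.8,
     8.1, 8.8, 9.0, 9.9, 10.2, 11.5, 12.3, 15.9]

n = len(x)
at = max(x) - min(x)             # total range AT = max(x) - min(x)

k_sqrt = math.sqrt(n)            # square-root rule: k = sqrt(n)
k_sturges = 1 + math.log2(n)     # Sturges' rule: k = 1 + log2(n)

h_sqrt = at / k_sqrt             # class width h = AT / k
h_sturges = at / k_sturges

print(f"AT = {at:.1f}")
print(f"sqrt rule:    k = {k_sqrt:.1f}, h = {h_sqrt:.3f}")
print(f"Sturges rule: k = {k_sturges:.1f}, h = {h_sturges:.3f}")
```

Both rules are only starting points; in practice h is usually rounded up to a convenient value so the class limits are easy to read.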
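Creating the table (assuming you are also using a pandas DataFrame)
1 - Sort the DataFrame values
import pandas as pd  # assumes `data` is a pandas DataFrame that is already loaded
df = data['fixed acidity']
# sort_values returns a new Series, so assign the result back
df = df.sort_values(ascending=True)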
2 - Calculate the total range (amplitude) of the data
# Data range = largest value - smallest value
at = df.max() - df.min()
3 - Calculate the class width (class amplitude)
import math
# Recall that k = square root of the total number of records/samples
k = math.sqrt(len(df))
# The class width can be rounded to an integer, usually to make the table easier to read.
h = at/k
h = math.ceil(h)
4 - Generate the frequency table labels
frequencias = []
# Smallest value in the series
menor = round(df.min(),1)
# Smallest value plus the class width
menor_amp = round(menor+h,1)
valor = menor
while valor < df.max():
    frequencias.append('{} - {}'.format(round(valor,1),round(valor+h,1)))
    valor += h
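5 - Frequency distribution:
# Class edges that match the labels built above: menor, menor+h, menor+2h, ...
limites = [menor + i*h for i in range(len(frequencias)+1)]
# pd.cut assigns each value to an equal-width class (pd.qcut would build equal-frequency bins, which do not match the labels)
freq_abs = pd.cut(df, bins=limites, labels=frequencias, include_lowest=True)
print(freq_abs.value_counts(sort=False))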
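The question also asked for relative and cumulative frequencies and the class midpoints. A possible way to get all of them in one table is sketched below. The Series `s` is hypothetical data standing in for `data['fixed acidity']`, and two choices here are assumptions rather than part of the answer above: the first class starts at the floor of the minimum, and classes are closed on the left (`right=False`).

```python
import math
import pandas as pd

# Hypothetical values standing in for data['fixed acidity']
s = pd.Series([4.6, 5.0, 5.2, 6.1, 6.3, 7.0, 7.4, 7.8,
               8.1, 8.8, 9.0, 9.9, 10.2, 11.5, 12.3, 15.9])

k = math.sqrt(len(s))                      # estimated number of classes
h = math.ceil((s.max() - s.min()) / k)     # class width, rounded up to an integer

# Build class edges from floor(min) until the last edge is strictly above max
edges = [math.floor(s.min())]
while edges[-1] <= s.max():
    edges.append(edges[-1] + h)

# Equal-width, left-closed classes [a, b)
classes = pd.cut(s, bins=edges, right=False)

# Absolute frequency per class, kept in class order
table = classes.value_counts().sort_index().rename('abs_freq').to_frame()
table['rel_freq'] = table['abs_freq'] / table['abs_freq'].sum()
table['cum_abs'] = table['abs_freq'].cumsum()
table['cum_rel'] = table['rel_freq'].cumsum()
# Midpoint of each class interval
table['midpoint'] = [interval.mid for interval in table.index]

print(table)
```

Building the edges explicitly keeps the class limits under your control, which is the main difference from letting pandas choose the bins.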
Reference for the calculations and some of the examples used: https://www.inf.ufsc.br/~Andre.zibetti/probability/Aed.html#Vari%C3%A1vel_quantitativa_discrete
If anyone knows an easier or better way to do this, please teach me; this was the best way I could find :)
Because the question is very broad, I can't confirm whether this really does what was asked. Still, it was a great explanation, Gabi. Thanks for sharing!
You’re welcome! I thought it would be
Your question is too general! You haven’t shown the data you have (at least a small sample, to know how to work) and what you’ve done and how you’re trying to implement it. Almost everything you want is possible to do with numpy/scipy functions.
– Guto
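As a rough illustration of the numpy route mentioned in this comment, the sketch below uses `np.histogram`. The array `x` is made-up data, and letting numpy place `k` equal-width bins between the minimum and maximum is an assumption, so the class limits will not be round numbers.

```python
import numpy as np

# Hypothetical sample (not from the question)
x = np.array([4.6, 5.0, 5.2, 6.1, 6.3, 7.0, 7.4, 7.8,
              8.1, 8.8, 9.0, 9.9, 10.2, 11.5, 12.3, 15.9])

k = int(np.ceil(np.sqrt(x.size)))          # number of classes from the square-root rule

# counts: absolute frequency per class; edges: the k + 1 class limits
counts, edges = np.histogram(x, bins=k)

rel = counts / counts.sum()                # relative frequency
cum_abs = np.cumsum(counts)                # cumulative absolute frequency
cum_rel = np.cumsum(rel)                   # cumulative relative frequency
mid = (edges[:-1] + edges[1:]) / 2         # midpoint of each class

for lo, hi, c, r, ca, m in zip(edges[:-1], edges[1:], counts, rel, cum_abs, mid):
    print(f"{lo:6.3f} - {hi:6.3f}  abs={c}  rel={r:.3f}  cum={ca}  mid={m:.3f}")
```

`np.histogram` also accepts `bins='sturges'` if you prefer that rule for the number of classes.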