How to make a frequency distribution table in Python?


Good afternoon,

One question: could someone explain how I can build a frequency distribution table with classes, absolute and relative frequencies, cumulative frequencies, and the average value of each class?

  • 1

    Your question is too general! You haven’t shown the data you have (at least a small sample, so we know what we are working with), what you’ve done, or how you’re trying to implement it. Almost everything you want can be done with numpy/scipy functions.

1 answer

2

Hi, I managed to create a... Here is the example using pandas; maybe it will help you!

Calculations needed to generate the table: the class width (h) comes from the relationship h = AT/k, where AT = max(x) - min(x) is the total range (amplitude) of the data and k = sqrt(n) is an estimated number of classes for a data set with n observations (k can also be calculated by other definitions, such as the Sturges rule).

Creating the table, assuming you are also using a pandas DataFrame:
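As a minimal sketch of the relationship above, the snippet below computes k and h for a small made-up sample. The sample values and the helper name class_width are illustrative assumptions, not part of the answer; the Sturges rule is taken in its common form k = 1 + log2(n).

```python
import math

def class_width(values, rule="sqrt"):
    """Return (k, h): estimated number of classes and class width h = AT / k."""
    n = len(values)
    at = max(values) - min(values)      # total range AT = max(x) - min(x)
    if rule == "sqrt":
        k = math.sqrt(n)                # square-root rule: k = sqrt(n)
    elif rule == "sturges":
        k = 1 + math.log2(n)            # Sturges rule: k = 1 + log2(n)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return k, at / k

# Hypothetical sample of 16 observations (not real data)
sample = [4.6, 5.0, 5.2, 5.9, 6.1, 6.3, 6.8, 7.0,
          7.1, 7.4, 7.8, 8.2, 8.9, 9.5, 10.1, 12.6]

k_sqrt, h_sqrt = class_width(sample, "sqrt")
k_sturges, h_sturges = class_width(sample, "sturges")
print("sqrt rule:   ", k_sqrt, h_sqrt)
print("Sturges rule:", k_sturges, h_sturges)
```

Both rules only give an estimate; in practice k and h are usually rounded to convenient values, as the steps in this answer do with math.ceil.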

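1 - Sort the DataFrame values

# 'data' is assumed to be your DataFrame, already loaded; sort_values returns a new Series, so assign the result
df = data['fixed acidity'].sort_values(ascending=True)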

2 - Calculate the total range (amplitude) of the data

# Range of the data = largest value in the records - smallest value
at = df.max() - df.min()

3 - Calculate the class width

import math

# Recall that k = square root of the total number of records/samples
k = math.sqrt(len(df))
# The class width can be rounded to an integer, usually to make the table easier to read
h = at/k
h = math.ceil(h)

4 - Generate the class labels for the frequency table

frequencias = []

# Smallest value in the series
menor = round(df.min(),1)

# Smallest value plus the class width (upper limit of the first class)
menor_amp = round(menor+h,1)

valor = menor
while valor < df.max():
    frequencias.append('{} - {}'.format(round(valor,1),round(valor+h,1)))
    valor += h

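5 - Frequency distribution (pd.cut splits the values into the equal-width classes labeled by the list created above; pd.qcut would split by quantiles, so the labels would not match the bins):

import pandas as pd
limites = [menor + i*h for i in range(len(frequencias) + 1)]  # class limits
freq_abs = pd.cut(df, bins=limites, labels=frequencias, include_lowest=True)
print(freq_abs.value_counts(sort=False))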
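The question also asks for relative and cumulative frequencies and a value per class, which the steps above do not cover. A minimal sketch of one way to build the whole table follows. The function name frequency_table, the column names, and the sample data are assumptions made for illustration, and "average value of each class" is interpreted here as the class midpoint.

```python
import math
import pandas as pd

def frequency_table(values):
    """Equal-width frequency table; assumes max(values) > min(values)."""
    s = pd.Series(values, dtype="float64")
    lo, hi = s.min(), s.max()
    k = math.sqrt(len(s))                 # square-root rule, as in the answer
    h = math.ceil((hi - lo) / k)          # class width, rounded up
    n_classes = math.ceil((hi - lo) / h)
    edges = [lo + i * h for i in range(n_classes + 1)]
    edges[-1] = max(edges[-1], hi)        # guard against float rounding
    labels = [f"{a:g} - {b:g}" for a, b in zip(edges[:-1], edges[1:])]

    # pd.cut uses the explicit equal-width limits; classes are right-closed,
    # and include_lowest=True keeps the minimum in the first class.
    classes = pd.cut(s, bins=edges, labels=labels, include_lowest=True)

    table = pd.DataFrame({"abs_freq": classes.value_counts(sort=False)})
    table["rel_freq"] = table["abs_freq"] / len(s)
    table["cum_abs_freq"] = table["abs_freq"].cumsum()
    table["cum_rel_freq"] = table["rel_freq"].cumsum()
    table["midpoint"] = [(a + b) / 2 for a, b in zip(edges[:-1], edges[1:])]
    return table

# Hypothetical sample (not real data)
sample = [4.5, 5.0, 5.5, 6.0, 6.0, 6.5, 7.0, 7.0,
          7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 11.0, 12.5]
table = frequency_table(sample)
print(table)
```

If the mean of the observations inside each class is wanted instead of the midpoint, something like s.groupby(classes, observed=False).mean() could replace the midpoint column.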

Reference for the calculations and some of the examples used: https://www.inf.ufsc.br/~Andre.zibetti/probability/Aed.html#Vari%C3%A1vel_quantitativa_discrete

  • If anyone knows an easier/better way to do this, please teach me; it was the best way I could find :)

  • Because the question is very broad, I can’t confirm whether this really meets what was asked. However, it was a great explanation, Gabi, thanks for sharing!

  • You’re welcome! I thought it would be
