How to find the mean of a frequency table with class intervals in R


class table for the number of divorces

I would like to know how to find the mean of a frequency table using R. I tried:

sum(n.div * p.m)/5000

But the result is far off (6900). When I apply the same procedure to any other table, it works. The variable p.m is the class midpoint, n.div is the frequency, and Fac is the cumulative frequency. The code follows:

anos.casamen <- c("0|--6","06|--12","12|--18","18|--24","24|--30")

n.div <- c(2800, 1400, 600, 150, 50)

cartorio <- data.frame(anos.casamen,n.div)

cartorio["Fac"]<- cumsum(n.div)

cartorio["porcentagem"]<- round(prop.table(n.div),digits = 4)

medio <- sum(cartorio$n.div * cartorio$p.m)/length(n.div)
  • 2

    Shouldn't it be sum(cartorio$n.div * cartorio$p.m)/cartorio$Fac[length(n.div)]? Or, more succinctly: with(cartorio, sum(n.div*p.m)/sum(n.div))

  • 1

    @Carloseduardolagosta This should be an answer. Another option is with(cartorio, weighted.mean(p.m, porcentagem)).

  • Yes, it actually gives 6.9, but I don't understand how this number helps me understand the average. Compared with the frequency column n.div (number of divorces), the result is either 6.9, which is an extremely small value, or 6900, which exceeds the maximum.

  • 1

    It is not a small value: there are 4200 values between 0 and 12 but only 800 between 12 and 30. This value helps show that the data distribution is asymmetrical.

  • Yes, I understand now, thank you very much. The explanation about the asymmetry made it clear, even though it is not a linear average.

1 answer

4


One way to calculate weighted averages is with weighted.mean.

with(cartorio, weighted.mean(p.m, porcentagem))
#[1] 6.9

This gives the same result as the solution in user Carlos Eduardo Lagosta's comment.
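To make the equivalence concrete, here is a minimal sketch that uses only the midpoints and frequencies from the question. It also suggests where the 6900 probably comes from: the question's code divides by length(n.div), the number of classes (5), instead of the number of observations (5000). The names total, mean_by_sum, mean_by_wm and wrong are introduced here for illustration only.

```r
# Class midpoints and frequencies, copied from the table in the question
p.m   <- c(3, 9, 15, 21, 27)
n.div <- c(2800, 1400, 600, 150, 50)

# Sum of midpoint * frequency over all classes
total <- sum(n.div * p.m)                 # 34500

# Dividing by the number of observations gives the grouped mean
mean_by_sum <- total / sum(n.div)         # 34500 / 5000

# weighted.mean() normalises the weights itself,
# so the raw counts can be used directly as weights
mean_by_wm <- weighted.mean(p.m, n.div)

# Dividing by the number of classes instead reproduces the 6900
wrong <- total / length(n.div)            # 34500 / 5
```

Since weighted.mean divides by sum(w), passing n.div or porcentagem as weights should lead to the same value.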

One comment says that the weighted average is very small. In fact, it is not small, considering that the distribution of the data is asymmetrical:

  1. there are 4200 values between 0 and 12
  2. but only 800 between 12 and 30.

This mean value helps to understand that the data distribution is asymmetrical. This can also be seen graphically.

with(cartorio, barplot(setNames(n.div, anos.casamen)))

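The asymmetry can also be checked numerically. The sketch below is only an illustration using the data from the question; cum_share, grouped_mean and in_range are names made up for it. It computes the cumulative share of divorces up to the end of each class, and checks that the mean lies within the range of the midpoints p.m (years of marriage), which is the scale it should be compared with, rather than the frequency column n.div.

```r
anos.casamen <- c("0|--6", "06|--12", "12|--18", "18|--24", "24|--30")
n.div <- c(2800, 1400, 600, 150, 50)
p.m   <- c(3, 9, 15, 21, 27)

# Cumulative share of divorces up to the end of each class:
# 0.56, 0.84, 0.96, 0.99, 1.00
cum_share <- setNames(cumsum(n.div) / sum(n.div), anos.casamen)
cum_share

# The grouped mean is measured in years (the scale of p.m),
# so it has to fall between the smallest and largest midpoint
grouped_mean <- weighted.mean(p.m, n.div)
in_range <- grouped_mean >= min(p.m) && grouped_mean <= max(p.m)
in_range
```

With 84% of the divorces in the first two classes, a mean close to the lower end of the 3 to 27 range is what one would expect.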

Complete data

anos.casamen <- c("0|--6","06|--12","12|--18","18|--24","24|--30")
n.div <- c(2800, 1400, 600, 150, 50)
cartorio <- data.frame(anos.casamen,n.div)

cartorio$Fac <- cumsum(cartorio$n.div)
cartorio$porcentagem <- c(0.56, 0.28, 0.12, 0.03, 0.01)
cartorio$p.m <- c(3, 9, 15, 21, 27)

cartorio 
#  anos.casamen n.div  Fac porcentagem p.m
#1        0|--6  2800 2800        0.56   3
#2      06|--12  1400 4200        0.28   9
#3      12|--18   600 4800        0.12  15
#4      18|--24   150 4950        0.03  21
#5      24|--30    50 5000        0.01  27
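The p.m and porcentagem columns above are typed in by hand. As an alternative sketch, they could be derived from the class limits and the frequencies, assuming every class is 6 years wide and the first one starts at 0, as the labels suggest. The helper vectors lower and upper are hypothetical names, not part of the original code.

```r
# Assumption: classes are 6 years wide and start at 0, as in the labels
lower <- seq(0, 24, by = 6)      # lower class limits: 0, 6, 12, 18, 24
upper <- lower + 6               # upper class limits: 6, 12, 18, 24, 30

# Midpoint of each class
p.m <- (lower + upper) / 2       # 3, 9, 15, 21, 27

# Relative frequencies, computed as in the question
n.div <- c(2800, 1400, 600, 150, 50)
porcentagem <- round(prop.table(n.div), digits = 4)
```

Deriving the columns this way avoids typing errors if the class limits or the counts change.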
  • Yes, I just don't understand the result: it gives 6.9, whereas in other tables the result falls between the numbers in the frequency column. 6.9 is very small, and 6900 exceeds the maximum value.

  • 1

    @Di82rquant It is not a small number, see the edit.
