To do this, use groupby on the columns 'Região de Procedência' (region of origin) and 'Grau de Instrução' (education level), then use size to count the rows in each group. After that, you can move the 'Grau de Instrução' level out of the index with unstack, so that its values become columns, like this:
df2 = df.groupby(['Região de Procedência', 'Grau de Instrução']).size().unstack(1)
df2.head()
Grau de Instrução      ensino fundamental  ensino médio  superior
Região de Procedência
capital                                 4             5         2
interior                                3             7         2
outra                                   5             6         2
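The steps above can be tried on a small frame. This is only a minimal sketch: the column names come from the answer, but the rows are made up for illustration, and passing fill_value=0 to unstack is an addition of mine so that region/degree combinations with no rows show 0 instead of NaN.

```python
import pandas as pd

# Hypothetical sample data: column names match the answer, the rows are invented.
df = pd.DataFrame({
    'Região de Procedência': ['capital', 'capital', 'interior',
                              'interior', 'interior', 'outra'],
    'Grau de Instrução': ['superior', 'ensino médio', 'ensino médio',
                          'ensino médio', 'ensino fundamental', 'superior'],
})

# Count the rows in each (region, degree) pair.
# The result is a Series indexed by both columns.
counts = df.groupby(['Região de Procedência', 'Grau de Instrução']).size()

# Move the second index level into the columns.
# fill_value=0 (my addition) replaces the NaN of empty combinations.
table = counts.unstack(1, fill_value=0)
print(table)
```

Without fill_value, any combination that never occurs in the data would come out as NaN and the counts would be shown as floats.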
To calculate the totals, sum each column with sum(axis=0) and store the result in a new index row called "Total"; then apply the same function row by row with sum(axis=1) to create a new "Total" column.
df2.loc['Total',:]= df2.sum(axis=0)
df2.loc[:,'Total'] = df2.sum(axis=1)
df2.head()
Grau de Instrução      ensino fundamental  ensino médio  superior  Total
Região de Procedência
capital                               4.0           5.0       2.0   11.0
interior                              3.0           7.0       2.0   12.0
outra                                 5.0           6.0       2.0   13.0
Total                                12.0          18.0       6.0   36.0
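As a cross-check, pandas also has pd.crosstab, which can build the same kind of frequency table with totals in a single call. This is an alternative to the steps in this answer, not a replacement for them. A minimal sketch follows, again on made-up rows (only the column names are taken from the answer); the astype(int) step is my own addition, to undo a possible upcast to float when the "Total" row is added.

```python
import pandas as pd

# Hypothetical sample data: column names match the answer, the rows are invented.
df = pd.DataFrame({
    'Região de Procedência': ['capital', 'capital', 'interior',
                              'interior', 'interior', 'outra'],
    'Grau de Instrução': ['superior', 'ensino médio', 'ensino médio',
                          'ensino médio', 'ensino fundamental', 'superior'],
})

# Step-by-step version: counts per group, then a Total row and a Total column.
table = (df.groupby(['Região de Procedência', 'Grau de Instrução'])
           .size()
           .unstack(1, fill_value=0))
table.loc['Total', :] = table.sum(axis=0)   # column sums become a new row
table.loc[:, 'Total'] = table.sum(axis=1)   # row sums become a new column
table = table.astype(int)                   # undo a possible upcast to float

# One-call alternative: crosstab with margins adds the same totals.
ct = pd.crosstab(df['Região de Procedência'], df['Grau de Instrução'],
                 margins=True, margins_name='Total')

print(table)
print(ct)
```

Both tables should hold the same counts; crosstab keeps them as integers without the extra conversion.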
What are these numbers you are looking for?
– Terry
A cross-tabulation of frequencies... it is a concise way of visualizing data... it appears in every statistics book, and I wanted to know how to do it with pandas.
– Gustavo Oliveira
Sorry, I guess I didn't make myself clear in my previous comment. Can you describe in detail how those numbers are filled in? Which column do they come from? What happens when the "Capital" value repeats? Are the numbers in each column summed? Is the average calculated?
– Terry
Ah yes, excuse me... these numbers are the frequencies... for example, the column 'Grau de Instrução' holds qualitative data; if we count the occurrences we get 12 for ensino fundamental (elementary school), 18 for ensino médio (high school) and 6 for superior (higher education)... and these data are related to the region... the table from the book I posted shows, for example, that of the elementary school cases, 4 belong to the capital... What I don't know is how to cross-reference the information from these two columns... thank you very much for your interest in helping me, Terry...
– Gustavo Oliveira