I am new to programming, and I am studying to build up knowledge in Data Science.
Here is my problem: I have a DataFrame with a lot of information, including gender and age. I want to get the number of rows for each gender (male and female) and classify them as children (0 to under 12 years), youth (12 to under 18 years) and adults (18+ years).
I am so lost that I do not even know whether what I have so far is right...
Input: df.groupby("Sex").Age.unique()
Output:
Sex
female [38.0, 26.0, 35.0, 27.0, 14.0, 4.0, 58.0, 55.0...
male [22.0, 35.0, 29.0, 54.0, 2.0, 20.0, 39.0, 34.0...
Name: Age, dtype: object
Variable:
classification = df.groupby("Sex").Age.unique()
Now I imagine I need to write a for loop, is that right? But how do I label each case?
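A loop is probably not needed here. One possible approach, sketched below, is to label each row with pd.cut and then count rows per gender and age group with pd.crosstab. The sample data, the "AgeGroup" column name and the labels "child", "youth" and "adult" are assumptions made for illustration; only the "Sex" and "Age" columns come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names "Sex" and "Age"
# come from the question.
df = pd.DataFrame({
    "Sex": ["female", "male", "female", "male", "male", "female", "male"],
    "Age": [38.0, 22.0, 14.0, 2.0, 35.0, 4.0, 12.0],
})

# Left-closed bins: [0, 12) children, [12, 18) youth, [18, inf) adults.
bins = [0, 12, 18, np.inf]
labels = ["child", "youth", "adult"]  # assumed label names
df["AgeGroup"] = pd.cut(df["Age"], bins=bins, labels=labels, right=False)

# Count rows per gender and age group.
# Rows with a missing Age get no group and are left out of the table.
counts = pd.crosstab(df["Sex"], df["AgeGroup"])
print(counts)
```

With right=False, an age of exactly 12 falls in the youth group and an age of exactly 18 in the adult group, which matches the ranges described in the question.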
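To get the quantity, just use len(classification[i]), where i is 0 for female and 1 for male. Note that unique() drops repeated ages, so this gives the number of distinct ages in each group, not the number of rows. To classify them, see if this link helps you. – AlexCiuffa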
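To illustrate the difference between counting distinct ages and counting rows, here is a small sketch. The data is made up for the example, and the names distinct_ages and rows_per_sex are hypothetical; it also accesses the groups by label rather than by position.

```python
import pandas as pd

# Hypothetical sample data with repeated ages in each group.
df = pd.DataFrame({
    "Sex": ["female", "male", "female", "male", "male"],
    "Age": [38.0, 22.0, 38.0, 22.0, 35.0],
})

classification = df.groupby("Sex").Age.unique()

# Number of distinct ages per gender (what len() on the unique arrays gives).
distinct_ages = classification.map(len)

# Number of rows per gender, regardless of repeated ages.
rows_per_sex = df.groupby("Sex").size()

print(len(classification["female"]))  # label-based access to one group
print(distinct_ages)
print(rows_per_sex)
```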