After importing a CSV into a Jupyter notebook and mapping a variable to 0 and 1, the table shows NaN

import pandas as pd

Train = pd.read_csv('database')
Train.head()

# Map Sex to 0 and 1
Train['Sex_num'] = Train.Sex.map({'Female': 0, 'Male': 1})

# Compare using loc: select rows 0 to 4 and the columns Sex and Sex_num
Train.loc[0:4, ['Sex', 'Sex_num']]

  • If Sex holds female and male, all lowercase, why did you use Female and Male in the map?

  • That was exactly it, thank you.
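The mismatch pointed out in the comments can be reproduced with a small sketch. The sample data and the names `train`, `wrong`, `right` and `mapping` are made up for illustration. The point is that `Series.map` returns NaN for any value that has no matching key in the dict, and normalizing the case first is one possible way around it.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data: the values are all lowercase.
train = pd.DataFrame({"Sex": ["male", "female", "female", "male"]})

# Keys with the wrong case match nothing, so every row becomes NaN.
wrong = train["Sex"].map({"Female": 0, "Male": 1})

# Lowercasing the column and using lowercase keys makes the lookup match.
mapping = {"female": 0, "male": 1}
right = train["Sex"].str.lower().map(mapping)

print(wrong.isna().all())
print(right.tolist())
```

Checking `Train.Sex.unique()` before writing the dict is a quick way to see the exact spelling of the values.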

1 answer

I see this has already been solved, though.

When a variable is categorical, the pandas "category" dtype is often a good fit, depending on the case.

Creating Test Dataframe

>>> df = pd.DataFrame({"Sex": ["male","male","female","male","female","female","male","male"]})
>>> df
      Sex
0    male
1    male
2  female
3    male
4  female
5  female
6    male
7    male

Applying category

>>> df["Category"] = df["Sex"].astype("category")
>>> df["Category"] = df["Category"].cat.rename_categories([0, 1])
>>> df
      Sex Category
0    male        1
1    male        1
2  female        0
3    male        1
4  female        0
5  female        0
6    male        1
7    male        1
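As a possible alternative to relabeling the categories, the integer codes that pandas already stores for a categorical column can be read through the `.cat.codes` accessor. A minimal sketch, reusing the test dataframe above (the names `sex_cat` and `Code` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Sex": ["male", "male", "female", "male",
                           "female", "female", "male", "male"]})

# Convert to the category dtype; pandas infers the categories from the data.
sex_cat = df["Sex"].astype("category")
print(list(sex_cat.cat.categories))

# Each row's code is the position of its value in the categories list.
df["Code"] = sex_cat.cat.codes
print(df["Code"].tolist())
```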

Edited on 25/3/2021

Based on the @Woss question: "How would you define which is 0 and which is 1?"

Answer: The categories are defined in alphabetical order. That is why 0 is associated with female and 1 with male.

See another example:

New test dataframe

>>> df = pd.DataFrame({"Sex": ["outro", "male","male","female","male","outro","female","female","outro","male","male","outro"]})
>>> df
       Sex
0    outro
1     male
2     male
3   female
4     male
5    outro
6   female
7   female
8    outro
9     male
10    male
11   outro

Applying categories

>>> df["Category"] = df["Sex"].astype("category")
>>> df["Category"] = df["Category"].cat.rename_categories([0, 1, 2])
>>> df
       Sex Category
0    outro        2
1     male        1
2     male        1
3   female        0
4     male        1
5    outro        2
6   female        0
7   female        0
8    outro        2
9     male        1
10    male        1
11   outro        2

Note that even though "outro" (other) is the first item, it gets category 2. If you want a different association, something like [0, 2, 1] would lead to 0=female, 2=male, 1=outro.
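One way to make such a non-default association explicit is to pass a dict to `rename_categories`, so the result does not depend on the sort order. A sketch, assuming the labels 0, 2, 1 from the note above are the ones wanted (the shortened sample data and the name `labels` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Sex": ["outro", "male", "female", "male", "outro"]})

# Explicit old-label -> new-label mapping, chosen here for illustration.
labels = {"female": 0, "male": 2, "outro": 1}
df["Category"] = df["Sex"].astype("category").cat.rename_categories(labels)

print(df["Category"].tolist())
```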

  • What rule does pandas use in this case to determine the mapping? In Sex we have ['male', 'female'], and they would be categorized as [0, 1]. How is it decided which one is 0 and which one is 1?

  • Very good question. The order is alphabetical. I updated the post.
