How to set the columns in Sklearn’s OneHotEncoder?


I’m having a small problem writing ML code for a classification task when using OneHotEncoder to encode categorical attributes. I am following a course that used it in the following way:

dados = pd.read_csv(...)

previsores = dados.iloc[:,0:14].values
classe = dados.iloc[:,14].values

I’ve split the matrix into the predictor attributes and the class. Some of these predictors are nominal categorical attributes (i.e., they have no hierarchical order).

labelencoder_previsores = LabelEncoder()
lista = [1,3,5,6,7,8,9,13]

for x in lista:
    previsores[:, x] = labelencoder_previsores.fit_transform(previsores[:, x])

Here I label-encoded them, converting them from strings to numbers. However, since they have no order, I need to use OneHotEncoder.

The problem is that, in the course, the instructor passes a categorical_features parameter to indicate which columns need this treatment, like this:

onehotencoder = OneHotEncoder(categorical_features=[1,3,5,6,7,8,9,13])
previsores = onehotencoder.fit_transform(previsores).toarray()

Apparently this parameter, categorical_features, no longer exists. I’ve been looking for a solution in forums and in the documentation, but this is only the third ML algorithm I’ve written in my life, so it’s been really hard. Does anyone know how I could solve this?

  • André, good afternoon! Can you provide the data you are using? Have you looked at ColumnTransformer?

  • Yes, of course! Here it is: https://archive.ics.uci.edu/ml/machine-learning-databases/adult/ (adult.data). I tried onehotencoder = ColumnTransformer(transformers=[('1', OneHotEncoder(), [1,3,5,6,7,8,9,13])]) followed by onehotencoder.fit_transform(previsores), and I’m trying to understand how to use it. Could you give me a hand with that? Thanks in advance!

  • See this post, specifically the EDIT in the most upvoted answer.

  • Hey, I used ct = ColumnTransformer(transformers=[('oh_enc', OneHotEncoder(sparse=False), [1,3,5,6,7,8,9,13])], remainder='passthrough') followed by ct.fit_transform(previsores). I printed the values before and after applying this ColumnTransformer and they are the same (as they should be). How can I test whether everything is okay? Thank you very much, my friend!

  • Wouldn’t doing it directly solve your problem? OneHotEncoder(sparse=False).fit_transform(dados.iloc[:,[1,3,5,6,7,8,9,13]])
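
The ColumnTransformer approach from the comments, together with a way to check the result, could be sketched roughly as follows. This is only an illustration under stated assumptions: the toy DataFrame, its column names, and the names categoricas and codificados are made up here and stand in for the Adult data and the real column list [1,3,5,6,7,8,9,13]. It also assumes that the printed values looked unchanged because the result of fit_transform was not assigned to a variable, which may or may not be what happened.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for the predictors (not the Adult dataset).
dados = pd.DataFrame({
    "age": [39, 50, 38, 53],
    "workclass": ["State-gov", "Private", "Private", "Self-emp"],
    "hours": [40, 13, 40, 40],
    "sex": ["Male", "Male", "Female", "Female"],
})
previsores = dados.values  # object array, like dados.iloc[:, 0:14].values
categoricas = [1, 3]       # positions of the nominal columns in this toy data

ct = ColumnTransformer(
    transformers=[("oh_enc", OneHotEncoder(), categoricas)],
    remainder="passthrough",  # keep the non-categorical columns unchanged
    sparse_threshold=0,       # always return a dense array
)

# fit_transform returns a new array; previsores itself is not modified.
codificados = ct.fit_transform(previsores)

# Check 1: the column count grows from 4 to (3 + 2 one-hot columns) + 2 passthrough.
print(previsores.shape, codificados.shape)

# Check 2: the categories learned for each encoded column.
print(ct.named_transformers_["oh_enc"].categories_)

# Check 3: each row has exactly one 1 per encoded column, so the one-hot
# block (the first 5 columns here) sums to the number of encoded columns.
print(codificados[:, :5].astype(float).sum(axis=1))
```

Two notes on the choices here. OneHotEncoder accepts string columns directly, so in this sketch the LabelEncoder loop is not needed beforehand. And sparse_threshold=0 on the ColumnTransformer is used to get a dense array instead of OneHotEncoder(sparse=False), since that argument was renamed to sparse_output in scikit-learn 1.2 and the right spelling depends on the installed version.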

No answers
