I'm having a small problem building an ML classification script that uses OneHotEncoder to encode categorical attributes. I'm following a course that used it in the following way:
import pandas as pd

dados = pd.read_csv(...)
previsores = dados.iloc[:, 0:14].values
classe = dados.iloc[:, 14].values
I've split the data into the predictor attributes and the class. Among the predictors I have some nominal categorical attributes (i.e., they have no hierarchical order).
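from sklearn.preprocessing import LabelEncoder

labelencoder_previsores = LabelEncoder()
lista = [1, 3, 5, 6, 7, 8, 9, 13]
for x in lista:
    # Replace each string column with integer labels
    previsores[:, x] = labelencoder_previsores.fit_transform(previsores[:, x])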
Here I label-encoded them, converting them from strings to numbers. However, since they have no order, I need to use OneHotEncoder.
The problem is that in the course the teacher passes a categorical_features parameter to indicate which columns need this treatment, like this:
onehotencoder = OneHotEncoder(categorical_features=[1,3,5,6,7,8,9,13])
previsores = onehotencoder.fit_transform(previsores).toarray()
Apparently this parameter, categorical_features, no longer exists. I've been looking through forums and the documentation for a way around it, but this is only the third ML algorithm I've written, so it's been really hard. Does anyone know how I could solve this?
André, good afternoon! Can you provide the data you are using? Have you looked at ColumnTransformer?
– lmonferrari
Yes, of course! Here it is: https://archive.ics.uci.edu/ml/machine-learning-databases/adult/ (Adult.data) onehotencoder = ColumnTransformer(transformers=[('1', OneHotEncoder(), [1,3,5,6,7,8,9,13])]) onehotencoder.fit_transform(previsores) I'm trying to understand how to use it. Could you give me a hand with that? Thanks in advance!
– André
See this post, specifically the EDIT in the top-voted answer.
– Paulo Marques
Hey! ct = ColumnTransformer(transformers=[('oh_enc', OneHotEncoder(sparse=False), [1,3,5,6,7,8,9,13])], remainder='passthrough') ct.fit_transform(previsores) I printed previsores before and after applying this ColumnTransformer and the values are the same (as they should be). How can I test whether everything is okay? Thank you very much, my friend!
– André
OneHotEncoder(sparse=False).fit_transform(dados.iloc[:,[1,3,5,6,7,8,9,13]])
Wouldn't doing it directly solve your problem?
– lmonferrari
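To make the ColumnTransformer suggestion from the comments concrete, here is a minimal sketch on a made-up four-row array standing in for previsores (the values and the column indices [1, 2] are assumptions for illustration, not the Adult dataset). Two things it tries to show: fit_transform returns a new array instead of changing previsores in place, so the result has to be assigned, and a simple sanity check is to compare the number of output columns with the number of distinct categories plus the untouched columns. Current versions of OneHotEncoder accept string columns directly, so the LabelEncoder loop is not needed in this sketch. sparse_threshold=0 is used here only to get a dense array back.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for `previsores` (hypothetical values):
# columns 1 and 2 are nominal, columns 0 and 3 are numeric.
previsores = np.array(
    [
        [39, "State-gov", "Bachelors", 40],
        [50, "Private", "HS-grad", 13],
        [38, "Private", "Bachelors", 40],
        [53, "State-gov", "Masters", 45],
    ],
    dtype=object,
)
categorical_columns = [1, 2]

ct = ColumnTransformer(
    transformers=[("oh_enc", OneHotEncoder(), categorical_columns)],
    remainder="passthrough",  # keep the numeric columns unchanged
    sparse_threshold=0,       # always return a dense array
)

# fit_transform returns a NEW array; `previsores` itself is not modified.
encoded = ct.fit_transform(previsores)

# Sanity check: one output column per distinct category,
# plus the columns that were passed through.
n_one_hot = sum(len(set(previsores[:, c])) for c in categorical_columns)
n_passthrough = previsores.shape[1] - len(categorical_columns)
assert encoded.shape == (previsores.shape[0], n_one_hot + n_passthrough)

print(previsores.shape, "->", encoded.shape)
print(encoded[0])
```

The one-hot columns come first in the output, followed by the passthrough columns. With the real data the same pattern should apply with the column list [1,3,5,6,7,8,9,13], assigning the result back, e.g. previsores = ct.fit_transform(previsores).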