Get the probability of belonging to each class

Question

Asked 6 years, 3 months ago

I have a theoretical problem: a store owner wants to know the chance that a particular phrase generates a sale. I have a dictionary of 20 random words and 10 phrases, each formed from exactly 10 words chosen at random from that dictionary:

Word dictionary:

I organized my sentences in a table and replaced the words with IDs to run a classification test:

I tested some sklearn algorithms, and the one that gave the best result was CART:

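from sklearn import model_selection
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

# x, y, valid (validation fraction) and sementes (random seed) are defined earlier
treino_x, valid_x, treino_y, valid_y = model_selection.train_test_split(
    x, y, test_size=valid, random_state=sementes)

pontuacao = 'accuracy'

modelos = []
modelos.append(('CART', DecisionTreeClassifier()))

resultado = []
nomes = []
for nome, modelo in modelos:
    # KFold only uses random_state when shuffle=True; recent sklearn versions
    # raise an error if random_state is set without it
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=sementes)
    cv_results = model_selection.cross_val_score(modelo, treino_x, treino_y, cv=kfold, scoring=pontuacao)
    resultado.append(cv_results)
    nomes.append(nome)
    msg = "%s: %f (%f)" % (nome, cv_results.mean(), cv_results.std())
    print(msg)


# Fit a final tree on the training split and score it on the validation split
cart = DecisionTreeClassifier()
cart.fit(treino_x, treino_y)
predictions = cart.predict(valid_x)
print(accuracy_score(valid_y, predictions))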

With this I can already build any new phrase and use cart.predict to tell me whether it is a sale or not. However, instead of just getting Sale / N back, I would like to know the probability that the phrase generates a sale (e.g. 78%). This is where the weight of each feature comes in: how do I calculate that weight using Python?

1 answer

by AlexCiuffa • **2,402** points · Answer 1 · 2019-04-09T20:56:50+00:00

To get the probability of belonging to each class, use the .predict_proba() method.

It is similar to .predict(), but returns the probability of belonging to each class as an array, with one column per class in the order given by cart.classes_. For example, a return value of array([[0., 1.]]) means a 0% chance of belonging to class A and a 100% chance of belonging to class B.

predictions = cart.predict_proba(valid_x)
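
As a rough, self-contained sketch of how this could look end to end, the snippet below trains a tree on made-up data and prints the probability per class for one new phrase. The data, the labels "Sale" / "N" as strings, and the choice of max_depth=1 are all assumptions for illustration, not taken from the question. For a decision tree, the probability is the fraction of training samples of each class in the leaf the phrase lands in, so a fully grown tree will often return only 0 or 1. Limiting the depth is one way to get intermediate values. The sketch also prints feature_importances_, which is probably the closest thing a tree offers to a "weight" per feature, although it measures how much each feature contributed to the splits, not a probability.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data (not the asker's): 8 phrases, each encoded as 3 word IDs.
X = np.array([
    [1, 5, 9],
    [1, 6, 9],
    [1, 5, 8],
    [1, 6, 8],
    [2, 5, 8],
    [2, 6, 8],
    [2, 5, 9],
    [2, 6, 9],
])
y = np.array(["Sale", "Sale", "Sale", "N", "N", "N", "N", "N"])

# max_depth=1 is an assumption made here so that leaves stay impure;
# an unrestricted tree would fit this data perfectly and return only 0s and 1s.
cart = DecisionTreeClassifier(max_depth=1, random_state=0)
cart.fit(X, y)

# A new phrase, encoded the same way as the training rows.
new_phrase = np.array([[1, 6, 9]])
proba = cart.predict_proba(new_phrase)

# Columns of predict_proba follow the order of cart.classes_.
for label, p in zip(cart.classes_, proba[0]):
    print(f"{label}: {p:.0%}")

# Relative contribution of each feature (column) to the tree's splits; sums to 1.
print(cart.feature_importances_)
```

To read off only the probability of a sale, look up the column index of that label with list(cart.classes_).index("Sale") instead of assuming a fixed position.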