ValueError in KFold from Scikit-Learn: My dataset has two classes! What’s going on?

I tried to cross-validate a logistic regression using Scikit-Learn. Here is the code:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import KFold

    # previsores (features) and classe (labels) are NumPy arrays loaded earlier.
    # The model setup was not shown; a default LogisticRegression is assumed.
    logmodel = LogisticRegression()
    lista_matrizes = []

    kf = KFold(n_splits=5, shuffle=False)
    for train_index, test_index in kf.split(previsores):
        X_train, X_test = previsores[train_index], previsores[test_index]
        y_train, y_test = classe[train_index], classe[test_index]

        logmodel.fit(X_train, y_train)
        # Compute the confusion matrix once per fold and reuse it
        matriz = confusion_matrix(y_test, logmodel.predict(X_test))
        print(matriz)
        lista_matrizes.append(matriz)

    print("Mean confusion matrix")
    print(np.mean(lista_matrizes, axis=0))

I’m getting the following error:

ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 1

My dataset has two classes (0 and 1), but I still get the error above. What should I do?

1 answer

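This can happen when one of the K-Fold folds ends up with samples from only one class. Since you use shuffle=False, each fold is a contiguous block of rows, so this is likely if the data is sorted by class or if one class has very few samples. Take a look at the size of your dataset and the size of the folds, and check whether it is possible for a fold to contain only one class.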
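If a fold does turn out to hold a single class, one common remedy (not something the question's code uses) is to swap KFold for StratifiedKFold, which keeps the class proportions roughly equal in every fold. The sketch below is only an illustration: the toy arrays are hypothetical stand-ins for previsores and classe, and the default LogisticRegression is an assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data standing in for previsores / classe:
# 20 samples sorted by class, the layout that breaks plain KFold.
rng = np.random.default_rng(0)
previsores = rng.normal(size=(20, 3))
classe = np.array([0] * 4 + [1] * 16)

# Stratified folds need the labels as the second argument to split()
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
lista_matrizes = []
for train_index, test_index in skf.split(previsores, classe):
    logmodel = LogisticRegression()  # assumed default model
    logmodel.fit(previsores[train_index], classe[train_index])
    y_pred = logmodel.predict(previsores[test_index])
    # labels=[0, 1] keeps every matrix 2x2 so they can be averaged
    lista_matrizes.append(
        confusion_matrix(classe[test_index], y_pred, labels=[0, 1])
    )

media = np.mean(lista_matrizes, axis=0)
print(media)
```

Note that StratifiedKFold expects n_splits to be no larger than the number of samples in the smallest class, which is why the toy example uses 4 splits.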

  • How can I check whether "a fold takes only one class"? Could you give more detail?
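One possible way to check, as a rough sketch: count the labels in the training part of each fold, since fit() is the call that raises the error. The classe array below is a hypothetical stand-in (sorted, with few samples of class 0); with the real data you would pass your own labels instead.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical labels standing in for classe: sorted, few samples of class 0.
classe = np.array([0] * 4 + [1] * 16)

kf = KFold(n_splits=5, shuffle=False)
single_class_folds = []
for fold, (train_index, test_index) in enumerate(kf.split(classe)):
    # Count how many samples of each class land in the training split
    labels, counts = np.unique(classe[train_index], return_counts=True)
    summary = dict(zip(labels.tolist(), counts.tolist()))
    print(f"fold {fold}: training labels {summary}")
    if len(labels) < 2:
        single_class_folds.append(fold)

print("folds whose training split has one class:", single_class_folds)
```

Here the first fold uses all four class-0 rows for testing, so its training split contains only class 1, which is the situation the error message describes.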
