Creating a data set for sklearn from a pandas DataFrame

Question

Asked 9 years ago

I have the following situation:

from sklearn.linear_model import LogisticRegression
import pandas as pd

x = pd.DataFrame({'A':[1,3,8,6,1],'B':[2,6,9,3,2]})
y = pd.DataFrame({'C':[8,6,3,6,1]})

How do I make this work?

LogisticRegression( ).fit(x, y)

I get the following error:

ValueError: Unknown label type: array([8,6,3,6,1])  # the values of y

What is the right way to do this?

I don't know much about Python, but logistic regression is used for classification. So I would guess that y in your case should not be numeric...

– Daniel Falbel

2016/07/01 at 17:16

Wow... I thought sklearn would do the same as Excel's Proj.log (LOGEST).

– Mueladavc

2016/07/01 at 17:54

1 answer

by Victor Capone • **414** points · Answer 1 · 2016-09-22T17:09:45+00:00

Sklearn's LogisticRegression is a classification method: it learns to distinguish between different classes (e.g. sick and healthy). For that to work, the call to fit() takes two parameters: the first is your "observations" in the form of a matrix, and the second is an array with the class corresponding to each observation. In your case I believe the problem is that you are passing a DataFrame as the second parameter; you may want to try LogisticRegression().fit(x, y.C).
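A minimal sketch of that suggestion, using the data from the question. It assumes a recent scikit-learn, where integer values in y are treated as class labels; max_iter=1000 is my own choice to give the solver room on such a tiny data set and is not part of the original answer.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

x = pd.DataFrame({'A': [1, 3, 8, 6, 1], 'B': [2, 6, 9, 3, 2]})
y = pd.DataFrame({'C': [8, 6, 3, 6, 1]})

# y.C (equivalently y['C']) is a 1-D Series with one label per row of x,
# which is the shape fit() expects for its second argument.
model = LogisticRegression(max_iter=1000).fit(x, y.C)

# Each distinct value of C is treated as a separate class, not as a quantity.
print(model.classes_)
print(model.predict(x))
```

Each distinct value in C becomes its own class here, so this only makes sense if C really holds categories rather than a measured quantity.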

Note that this method is not the same as Excel's Proj.log (LOGEST), which fits an exponential curve of the form y = b * (m1^x1) * (m2^x2) * ... * (mn^xn). Logistic regression instead fits an S-shaped (sigmoid) curve that gives the probability of each class.
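If what you actually want is a LOGEST-style exponential fit, y = b * (m1^x1) * (m2^x2) * ..., one way to sketch it is ordinary least squares on log(y), since log(y) = log(b) + x1*log(m1) + x2*log(m2) + .... The example below is an illustration under that assumption and not a claim about Excel's exact implementation. The y values are synthetic, generated from made-up coefficients (b_true, m1_true, m2_true) so that the recovered values can be checked.

```python
import numpy as np

# Predictor columns taken from the question's DataFrame x.
A = np.array([1, 3, 8, 6, 1], dtype=float)
B = np.array([2, 6, 9, 3, 2], dtype=float)

# Hypothetical coefficients, used only to generate illustrative y values.
b_true, m1_true, m2_true = 2.0, 1.1, 0.9
y = b_true * m1_true**A * m2_true**B

# Taking logs turns the exponential model into a linear one:
# log(y) = log(b) + A*log(m1) + B*log(m2)
design = np.column_stack([np.ones_like(A), A, B])
coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)

# Undo the log to get back b, m1 and m2.
b, m1, m2 = np.exp(coef)
print(b, m1, m2)
```

This approach requires every y value to be positive, because the logarithm is taken before fitting.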