Application of linear regression

Asked 7 years ago

I have two lists:

print(lista)
[970084.4148727012, 983104.7719906792, 996164.0, 1006426.5111488493, 1016687.0370821969, 1026941.5758164332, 1037185.9604590479, 1047415.8544247652, 1057626.746645888, 1067813.94679318, 1077972.5805253708, 1088097.584787312, 1098183.7031788095, 1147832.9385862947, 1195602.90322828, 1281768.5077875573]

print(new_list)
[3161, 3185, 3164, 3152, 3154, 3146, 3144, 3174, 0, 0, 0, 0, 0, 0, 0, 0]

I want to apply linear regression to predict the values that are 0 in new_list. To do this, I selected only the first 8 items, which are the ones with known values:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
X = lista[:8]
y = new_list[:8]

I split the data into training and test sets:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=2)

And applied linear regression:

regr = LinearRegression() 
regr.fit(X_train, y_train)

But I got an error:

ValueError: Expected 2D array, got 1D array instead: array=[ 970084.4148727 983104.77199068 996164. 1006426.51114885 1016687.0370822 1026941.57581643 1037185.96045905 1047415.85442477]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

What should I do?

Hey, how about taking the [tour], as I already suggested on your other question? You still haven’t learned how to use the site’s tools, yet you keep posting questions. Please take the [tour] to learn at least the basics of how the site works.

– Woss

2018/08/09 at 20:10

Sorry for my ignorance, but I don’t understand what is wrong with my question. I’ve read the tour and, as far as I can tell, nothing is wrong. I hope you can be more specific, because if I am doing something wrong I will fix it right away.

– user9080886

2018/08/09 at 20:21

At the moment nothing is wrong, since the question has been edited. Use backticks only to format inline code. For code snippets, just indent with 4 spaces. To make this easier, the editor has a {} button that formats the selected code.

– Woss

2018/08/09 at 20:24

I appreciate the clarification, and I’m sorry for the error.

– user9080886

2018/08/09 at 20:26

1 answer

by Pedro von Hertwig Batista • **3,434** points · Answer 1 · 2018-08-10T00:11:24+00:00

sklearn expects your X data to be a list of lists (a 2D array), because otherwise it has no way to distinguish between a dataset with, for example, 8 features and 1 sample and a dataset with 1 feature and 8 samples.
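To make the ambiguity concrete, here is a minimal sketch using numpy. The variable names (values, flat, column, row) are illustrative only, and the three numbers are rounded from the question's lista:

```python
import numpy as np

# First three values from the question's `lista`, rounded for brevity.
values = [970084.41, 983104.77, 996164.0]

flat = np.array(values)        # 1D: shape (3,), ambiguous for scikit-learn
column = flat.reshape(-1, 1)   # 2D: 3 samples, 1 feature each
row = flat.reshape(1, -1)      # 2D: 1 sample with 3 features

print(flat.shape, column.shape, row.shape)
```

For this question, the column layout is the relevant one, since each entry of lista is a separate observation of a single feature.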

To solve this, you can turn your list into a list of lists:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
# Wrap each value in its own list so that X has one row per sample
lista = [[elemento] for elemento in lista]
X = lista[:8]
y = new_list[:8]

...

Or use numpy with reshape, as suggested by the error message (Reshape your data using array.reshape(-1, 1) if your data has a single feature):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
X = lista[:8]
# -1 lets numpy infer the number of rows; 1 means a single feature column
X = np.array(X).reshape(-1, 1)
y = new_list[:8]

...
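Putting the pieces together, the following is one possible end-to-end sketch, not necessarily what the elided steps above contained. It assumes the goal is to fit on the 8 known pairs and then predict the 8 entries that are 0 in new_list. The names X_known, y_known, X_unknown, predicted, and filled are hypothetical and do not come from the question:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

lista = [970084.4148727012, 983104.7719906792, 996164.0, 1006426.5111488493,
         1016687.0370821969, 1026941.5758164332, 1037185.9604590479,
         1047415.8544247652, 1057626.746645888, 1067813.94679318,
         1077972.5805253708, 1088097.584787312, 1098183.7031788095,
         1147832.9385862947, 1195602.90322828, 1281768.5077875573]
new_list = [3161, 3185, 3164, 3152, 3154, 3146, 3144, 3174,
            0, 0, 0, 0, 0, 0, 0, 0]

# One row per sample, one column for the single feature.
X = np.array(lista).reshape(-1, 1)
y = np.array(new_list, dtype=float)

# Assumption: the first 8 targets are known, the last 8 are to be predicted.
X_known, y_known = X[:8], y[:8]
X_unknown = X[8:]

X_train, X_test, y_train, y_test = train_test_split(
    X_known, y_known, test_size=0.33, random_state=2)

regr = LinearRegression()
regr.fit(X_train, y_train)

# Predict the missing targets and write them over the zeros.
predicted = regr.predict(X_unknown)
filled = y.copy()
filled[8:] = predicted
print(filled)
```

Note that predict needs the same 2D layout as fit, which is why X is reshaped once up front. With only 8 known points, and inputs that lie beyond the training range, the predictions are extrapolations and should be treated with caution.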