I'm working on a Python linear regression project, but I'm having a problem with the model's .fit() call. The following errors occur.
With the code posted here:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
On another attempt:
ValueError: could not convert string to float: 'e'
I've searched the Internet, but nothing explains the 'e'. I have tried converting the values with int(float(x)), and the numbers are all floats of the form N.0, with no fractional part. Some of the values are 0.0 and others are very large. Here is the code:
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
#imports
dataset = pd.read_csv('movies_metadata.csv')
data = dataset.columns
data = dataset[['title', 'budget', 'revenue', 'vote_average']]
# data selection
custo = []
for i in data['budget']:
    try:
        custo.append(int(i))
    except ValueError:
        custo.append(0)
custo = pd.Series(custo)
data['custo'] = custo
data.drop(['budget'],axis = 1)
data.dropna()
# machine learning model
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import sklearn.metrics
X = data['custo'].values.reshape(-1,1)
Y = data['revenue'].values.reshape(-1,1)
treino_x, teste_x, treino_y, teste_y = train_test_split(X,Y,random_state = 101,train_size = 0.27)
lista = [treino_x, teste_x, treino_y, teste_y]
modelo = LinearRegression()
modelo.fit(teste_x,teste_y)
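
For anyone hitting the same two errors, here is a rough sketch of one way to clean the columns before fitting. It is not a confirmed fix for this dataset. It assumes the 'budget' column contains some non-numeric strings (which the 'e' in the second error hints at) and that 'revenue' has missing values. Note that `drop` and `dropna` return new DataFrames, so calling them without assigning the result leaves `data` unchanged. The helper name `clean_numeric` and the sample rows are made up for illustration, and fitting on the training split rather than the test split is my guess at the intent.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def clean_numeric(df, columns):
    """Hypothetical helper: return a copy of df whose `columns` are finite floats."""
    out = df.copy()
    for col in columns:
        # Strings that cannot be parsed become NaN instead of raising ValueError
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Treat +/-infinity as missing too
    out[columns] = out[columns].replace([np.inf, -np.inf], np.nan)
    # dropna returns a new frame, so the result must be kept
    return out.dropna(subset=columns).reset_index(drop=True)


# Made-up rows standing in for movies_metadata.csv
raw = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "budget": ["100", "200", "/poster.jpg", "400", "500", "600", "700", "800"],
    "revenue": [210.0, 405.0, 50.0, np.nan, 1010.0, 1195.0, 1400.0, 1610.0],
})

data = clean_numeric(raw, ["budget", "revenue"])

X = data[["budget"]].to_numpy()   # 2-D feature matrix
y = data["revenue"].to_numpy()    # 1-D target
treino_x, teste_x, treino_y, teste_y = train_test_split(
    X, y, random_state=101, test_size=0.25
)

modelo = LinearRegression()
modelo.fit(treino_x, treino_y)    # fit on the training split
previsoes = modelo.predict(teste_x)
```

With the real file, the same helper would be applied to the frame returned by `pd.read_csv` before building `X` and `y`. It may also be worth checking how many rows get dropped, since a large number would suggest the CSV has misaligned rows.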