I think this is a simple question. In every course I'm taking, the instructor splits the training and test data from a CSV or some other dataset. I want to test with user input instead, but when I try, I get an error saying the input must have the same number of samples as the training data. Is there a way to predict on just one input?
I am using the tratamento_5 column with the following code:
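from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# resenha (the DataFrame) and regressao_logistica (the classifier) are defined earlier
tfidf = TfidfVectorizer(lowercase=False)
vetor_tfidf = tfidf.fit_transform(resenha["tratamento_5"])
treino, teste, classe_treino, classe_teste = train_test_split(vetor_tfidf,
                                                              resenha["classificacao"],
                                                              random_state=42)
regressao_logistica.fit(treino, classe_treino)
acuracia_tfidf = regressao_logistica.score(teste, classe_teste)
print(regressao_logistica.predict(teste).tolist())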
This code separates test and training data and predicts with test data.
But I want something interactive, i.e. a text entered by the user. I tried it this way:
vetor_tfidf2 = tfidf.fit_transform(["Esse filme foi muito bom, gostei dos movimentos de ação do inicio até o final do filme"])
treino, teste, classe_treino, classe_teste = train_test_split(vetor_tfidf2,
                                                              resenha["classificacao"],
                                                              random_state=42)
regressao_logistica.fit(treino, classe_treino)
acuracia_tfidf = regressao_logistica.score(teste, classe_teste)
print(regressao_logistica.predict(teste).tolist())
print(vetor_tfidf2.shape)
print(resenha['classificacao'].shape)
But I get the following error:
ValueError: Found input variables with inconsistent numbers of samples: [1, 49459]
I think this happens because the single sentence (1 row) and the classificacao column (49459 rows) have different numbers of samples. How can I predict on just one sentence, without passing it through the DataFrame split as I tried above?
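For reference, here is a minimal sketch of the pattern that is usually suggested for this situation, not a definitive fix. The vectorizer and the model are fitted once on the labeled data, and the new sentence is then passed through transform (not fit_transform), so it is encoded with the vocabulary already learned. The toy lists textos and classes are hypothetical stand-ins for resenha["tratamento_5"] and resenha["classificacao"], and it assumes regressao_logistica is a scikit-learn LogisticRegression.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data standing in for resenha["tratamento_5"]
# and resenha["classificacao"].
textos = [
    "filme muito bom gostei",
    "filme otimo adorei",
    "filme ruim odiei",
    "filme pessimo detestei",
]
classes = ["pos", "pos", "neg", "neg"]

# Fit the vectorizer on the labeled texts only: this learns the vocabulary.
tfidf = TfidfVectorizer(lowercase=False)
vetor_treino = tfidf.fit_transform(textos)

# Assumption: the classifier is a plain LogisticRegression.
regressao_logistica = LogisticRegression()
regressao_logistica.fit(vetor_treino, classes)

# A single sentence; in an interactive script this could come from input().
entrada = "Esse filme foi muito bom, gostei"

# transform (not fit_transform) reuses the fitted vocabulary, so the result
# has one row and the same number of columns as the training matrix.
vetor_entrada = tfidf.transform([entrada])
predicao = regressao_logistica.predict(vetor_entrada)

print(vetor_entrada.shape)
print(predicao.tolist())
```

No train_test_split is involved for the user's sentence, since there is no label to split alongside it. Words that are not in the fitted vocabulary are simply ignored by transform.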