Hello, I have a somewhat unbalanced dataset and wanted to run some tests with SMOTE, but I'm getting an error:
library(DMwR)
treinoSmote <- SMOTE(TARGET ~ ., m, k = 5, perc.over = 100, perc.under = 200)
Error in factor(newCases[, a], levels = 1:nlevels(data[, a]), labels = levels(data[, : invalid 'labels'; length 0 should be 1 or 2
My TARGET is already a factor. I have tried it with the values 1 and 0, with S and N (yes and no), and so on, and it always gives that error.
My dataset is made up of integer, factor, and numeric features. There are about 20 at the moment.
The only advice I can find online says the target must be a factor, but it already is!
I ran the example from the SMOTE documentation itself with the iris dataset and it works normally. I checked the type of the target there and it is a factor as well. I don't understand why I am getting this error.
data(iris)
data <- iris[,c(1,2,5)]
data$Species <- factor(ifelse(data$Species == "setosa", "rare", "common"))
table(data$Species)
common rare
100 50
newData <- SMOTE(Species ~ ., data, perc.over = 600, perc.under = 100)
table(newData$Species)
common rare
300 350
It would help if you provided your data (or part of it) so that we can reproduce the problem. Use the dput command for that.
– Rafael Cunha
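For readers unfamiliar with it, the dput suggestion above can be sketched as follows. This is only an illustration: the built-in iris data stands in for the asker's data frame, and the name sample_rows is made up for the example.

```r
# Stand-in data: the built-in iris data frame replaces the asker's own data.
data(iris)

# Take a small slice and drop factor levels that no longer occur in it,
# so the printed structure stays short.
sample_rows <- droplevels(head(iris, 5))

# dput() prints R code that rebuilds the object exactly, including
# column types, which is what makes the problem reproducible for others.
dput(sample_rows)
```

The printed text can be pasted straight into the question; anyone who assigns it to a variable gets the same data frame with the same column classes.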
Will this do? https://anotepad.com/notes/ind9p6 Haha, I couldn't find anywhere to attach a file or anything like that.
– Geovani Ferreira
It is also worth saying which package you are using, since the SMOTE function is not a base R function. In these cases, always start the question by loading the package: library(DMwR).
– Rui Barradas
Sorry, I had not noticed that I left it out. I have already edited the question.
– Geovani Ferreira
Damn, I found the error. One feature had not been converted correctly: F8 should have been a factor but was character. Sorry for the trouble.
– Geovani Ferreira
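A minimal sketch of how a stray character column like this could be found and fixed before calling SMOTE. The data frame below is a made-up stand-in for m (only the column names F8 and TARGET come from the thread; F1, F2 and all values are invented). It is consistent with the error message, which complains about labels of length 0: levels() returns NULL for a character vector.

```r
# Hypothetical stand-in for the data frame `m` from the question:
# F8 is meant to be categorical but was read in as character.
m <- data.frame(
  F1 = c(1L, 2L, 3L, 4L),
  F2 = c(0.5, 1.5, 2.5, 3.5),
  F8 = c("a", "b", "a", "b"),
  TARGET = factor(c("N", "N", "N", "S")),
  stringsAsFactors = FALSE
)

# List the class of every column; character columns are the suspects.
sapply(m, class)

# A character vector has no levels, which matches "length 0" in the error.
levels(m$F8)

# Convert every character column to a factor in one step.
is_chr <- vapply(m, is.character, logical(1))
m[is_chr] <- lapply(m[is_chr], factor)

# Confirm the result: F8 should now be listed as a factor.
str(m)
```

Running the class check on the full data frame before the SMOTE call makes this kind of problem visible at a glance, even with 20 or more features.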