How can I use a one-class classifier, such as Isolation Forest, with cross-validation? I’m trying to do it this way:
from sklearn.ensemble import IsolationForest
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# `data` is a DataFrame with a binary "Class" column (1 = fraud, 0 = valid)
target = "Class"
columns = [c for c in data.columns if c != target]
X = data[columns]
Y = data[target]
Fraud = data[data["Class"] == 1]
Valid = data[data["Class"] == 0]
outlier_fraction = 0.5
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
modelIF = IsolationForest(max_samples=100, contamination=outlier_fraction, random_state=1)
modelIF.fit(X)
scores_pred = modelIF.decision_function(X)
y_pred = modelIF.predict(X)
# 0 for valid and 1 for fraud
y_pred[y_pred == 1] = 0
y_pred[y_pred == -1] = 1
# metrics without CV
print(accuracy_score(Y, y_pred))
print(classification_report(Y, y_pred))
But even after following the official sklearn documentation, every score comes back as nan:
from sklearn.model_selection import cross_val_score
scores = cross_val_score(modelIF, Y, y_pred, scoring='accuracy', cv=5)
print(scores)
[nan nan nan nan nan]
Stallone, please share your dataset so others can reproduce what you’re doing. Cheers!
– lmonferrari
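A possible explanation for the nan scores, offered as a sketch since the dataset isn't available: `cross_val_score` expects `(estimator, X, y)`, but the call in the question passes `Y` as the features and `y_pred` as the target. A 1-D feature input most likely makes `fit` fail in every fold, and with the default `error_score` that shows up as `nan`. Separately, `IsolationForest.predict` returns 1/-1 while `Class` is 0/1, so `scoring='accuracy'` would compare mismatched labels even with the right arguments. The example below assumes synthetic data in place of the real dataset and uses a hypothetical helper, `fraud_accuracy`, as a custom scorer; the `contamination` value and the explicit `StratifiedKFold` are illustrative choices, not something taken from the question.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold, cross_val_score


def fraud_accuracy(estimator, X, y):
    """Hypothetical scorer: map IsolationForest output to 0/1 before scoring."""
    pred = estimator.predict(X)            # 1 = inlier, -1 = outlier
    pred01 = np.where(pred == -1, 1, 0)    # 0 = valid, 1 = fraud
    return accuracy_score(y, pred01)


# Synthetic stand-in for the real dataset (assumption): 950 valid, 50 fraud rows.
rng = np.random.RandomState(1)
X = np.vstack([rng.normal(0, 1, size=(950, 4)),
               rng.normal(6, 1, size=(50, 4))])
Y = np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)])

modelIF = IsolationForest(max_samples=100, contamination=0.05, random_state=1)

# Stratify so that every fold contains some fraud rows.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Pass the features X and the true labels Y; y is ignored by fit
# but is handed to the scorer for evaluation.
scores = cross_val_score(modelIF, X, Y, scoring=fraud_accuracy, cv=cv)
print(scores)
```

With real data, `X` and `Y` would be `data[columns]` and `data[target]` from the question. Accuracy can be misleading when fraud is rare, so a scorer built the same way around recall or F1 may be more informative.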