I'm trying to write a program that reads a data set, sorts it by column, and then removes outliers. My function isn't working, I think because I'm using a newer version of pandas (0.25) in which the sort call no longer works. Can someone help me with this problem?
def remove_outlier(df):
    list = ['Unnamed: 32', 'diagnosis', 'id']
    x = df.drop(list, axis=1)
    # x.head()
    # df = x.sort_values
    x.sort(axis=0)  # <---- HERE IS THE PROBLEM, I THINK
    x = pd.DataFrame(df, index=x.index, columns=x.columns)
    x.loc[:, :]
    print(x)
    Q1 = x.quantile(0.25)
    Q3 = x.quantile(0.75)
    IQR = Q3 - Q1
    print(IQR)
    number_outliers = (x < (Q1 - 1.5 * IQR)) | (x > (Q3 + 1.5 * IQR))
    number_outliers.head(-1)
    remove_outliers = x[~((x < (Q1 - 1.5 * IQR)) | (x > (Q3 + 1.5 * IQR))).any(axis=1)]
    df1 = remove_outliers.head(-1)
    return df1
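One possible fix, offered as a sketch rather than a definitive answer: recent pandas releases no longer provide `DataFrame.sort` (the replacement is `sort_values`), but `quantile` does not need sorted input, so the sort step can probably be dropped altogether. The version below assumes that every column left after the drop is numeric. The name `remove_outliers` and the parameters `drop_cols` and `k` are hypothetical, introduced only for this illustration.

```python
import pandas as pd


def remove_outliers(df, drop_cols=("Unnamed: 32", "diagnosis", "id"), k=1.5):
    """Return the rows of df whose values all lie within the IQR fences.

    Assumes the columns remaining after the drop are numeric.
    """
    # errors="ignore" skips any listed column that is not present.
    x = df.drop(columns=list(drop_cols), errors="ignore")

    # quantile() works on unsorted data, so no explicit sort is needed.
    q1 = x.quantile(0.25)
    q3 = x.quantile(0.75)
    iqr = q3 - q1

    # Boolean frame: True where a value falls outside [Q1 - k*IQR, Q3 + k*IQR].
    is_outlier = (x < (q1 - k * iqr)) | (x > (q3 + k * iqr))

    # Keep only the rows that have no outlier in any column.
    return x[~is_outlier.any(axis=1)]
```

Note that this returns every remaining row, whereas `head(-1)` in the original also discards the last row; if that was intentional, it can be added back. If the rows really do need to be ordered, `x.sort_values(by="some_column")` keeps each row intact, where `"some_column"` stands for whichever column you want to order by.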
Hello @Vinicius reis, all right? Please ask your question in English so we can help you.
– Izak Mandrak