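I am trying to learn Python by using a random forest classifier to predict stock movement. My dataset has 8 features and 1201 records. After fitting the model and using it to predict, it shows 100% accuracy and a 100% OOB error. I changed n_estimators from 100 to a small value, but the OOB error only dropped by a few percent. Here is my code: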
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
#File reading
df = pd.read_csv('700.csv')
df.drop(df.columns[0],1,inplace=True)
target = df.iloc[:,8]
print(target)
#train test split
X_train, X_test, y_train, y_test = train_test_split(df, target, test_size=0.3)
#model fit
clf = RandomForestClassifier(n_estimators=100, criterion='gini',oob_score= True)
clf.fit(X_train,y_train)
pred = clf.predict(X_test)
accuaracy = accuracy_score(y_test,pred)
print(clf.oob_score_)
print(accuaracy)
How can I modify the code to bring the OOB error down? Thanks.
Answer 0 (score: 0)
If you want to check the error, use or modify your code like this:
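A minimal sketch of one way to do that check, under some assumptions. Note that `clf.oob_score_` is an accuracy, so the OOB error is `1 - clf.oob_score_`. Also, in the question's code `target` is taken from `df.iloc[:,8]` but the whole `df` (label column included) is passed to `train_test_split`, so the model can read the answer straight from its inputs, which would explain a perfect score. Since `700.csv` is not available here, the sketch uses synthetic data from `make_classification` as a stand-in; the column names `f0`..`f7` and `label` and the helper `fit_and_report` are made up for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for 700.csv (hypothetical data): 8 features plus a label column.
X_raw, y_raw = make_classification(
    n_samples=1201, n_features=8, n_informative=4, flip_y=0.2, random_state=0
)
df = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(8)])
df["label"] = y_raw

target = df.iloc[:, 8]                      # label column, as in the question
features = df.drop(columns=df.columns[8])   # inputs WITHOUT the label column


def fit_and_report(X, y):
    """Fit a random forest and return (OOB error, test accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0
    )
    clf = RandomForestClassifier(
        n_estimators=100, criterion="gini", oob_score=True, random_state=0
    )
    clf.fit(X_train, y_train)
    oob_error = 1.0 - clf.oob_score_        # oob_score_ is an accuracy, not an error
    test_accuracy = accuracy_score(y_test, clf.predict(X_test))
    return oob_error, test_accuracy


# Leaky setup (label left inside the inputs, as in the question) vs. clean setup.
leaky_oob_error, leaky_acc = fit_and_report(df, target)
clean_oob_error, clean_acc = fit_and_report(features, target)

print("leaky: OOB error =", leaky_oob_error, "accuracy =", leaky_acc)
print("clean: OOB error =", clean_oob_error, "accuracy =", clean_acc)
```

With your own data, the equivalent change would be to pass only the feature columns as the first argument of `train_test_split` and then print `1 - clf.oob_score_`. Expect the accuracy to fall to a more believable level once the label is no longer among the inputs.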