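After selecting features by ROC-AUC and dropping the ones that fell below a baseline, I tried to fit a model with a Gradient Boosted Machine (GBM). When I fit the GBM on the training set and then score it on the test set, I get an error message.
Here is how I implemented the GBM: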
# drop features whose ROC-AUC is below the 0.54 baseline
x_train.drop(labels=removed_roc_values, axis=1, inplace=True)
x_test.drop(labels=removed_roc_values, axis=1, inplace=True)
x_train.shape, x_test.shape
Output of the shapes after dropping the baseline features: ((4930, 17), (2113, 23))
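The two shapes above already disagree on the number of columns (17 vs. 23), so it may be worth listing which columns differ between the two frames. The snippet below is only a minimal sketch: the small DataFrames and their column names `a`, `b`, `c` are hypothetical stand-ins for my real `x_train` and `x_test`.

```python
import pandas as pd

# Hypothetical stand-ins for x_train / x_test with mismatched columns.
x_train = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
x_test = pd.DataFrame({"a": [5], "b": [6], "c": [7]})

# Columns present in one frame but missing from the other.
only_in_test = x_test.columns.difference(x_train.columns)
only_in_train = x_train.columns.difference(x_test.columns)

print("only in test:", list(only_in_test))
print("only in train:", list(only_in_train))
```

Running the same two `difference` calls on the real frames should show which columns account for the gap.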
# baseline GBM without tuning
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV
baseline = GradientBoostingClassifier(learning_rate=0.1, n_estimators=100,
                                      max_depth=3, min_samples_split=2,
                                      min_samples_leaf=1, subsample=1.0,
                                      max_features='sqrt', random_state=10)
baseline.fit(x_train,y_train)
predictors=list(x_train)
feat_imp = pd.Series(baseline.feature_importances_,
                     index=predictors).sort_values(ascending=False)
feat_imp.plot(kind='bar', title='Importance of Features')
plt.ylabel('Feature Importance Score')
print('Accuracy of the GBM on test set: {:.3f}'.format(
    baseline.score(x_test, y_test)))
pred=baseline.predict(x_test)
print(classification_report(y_test, pred))
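I expected to get a classification report; instead, I get the following error:
ValueError: Number of features of the model must match the input. Model
n_features is 17 and input n_features is 23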
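The error says the model was fitted on 17 features but is being given 23, so the test frame apparently needs the same columns, in the same order, as the training frame. One possible way to get there is sketched below; it is not necessarily the right fix for my data. The tiny DataFrames are hypothetical, and `fill_value=0` is an assumption that only makes sense if a column missing from the test set can reasonably be treated as all zeros (for example, a one-hot dummy).

```python
import pandas as pd

# Hypothetical stand-ins: the test frame has an extra column and a different order.
x_train = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
x_test = pd.DataFrame({"c": [7], "b": [6], "a": [5]})

# Keep only the training columns, in training order.
# Columns absent from x_test would be created and filled with 0 (assumption).
x_test_aligned = x_test.reindex(columns=x_train.columns, fill_value=0)

print(x_train.shape, x_test_aligned.shape)
```

After this step both frames have the same number of columns, which is what `predict` and `score` check for.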
Thanks.