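I have finished feature engineering and trained my XGBoost prediction model. My accuracy is 81%, but I don't know how to build a confusion matrix (for example, printing the true churn labels, 1 or 0, against the predicted churn labels). Is this possible? I would like to know the total number of "good predictions" (true positives and true negatives) and the total number of "bad predictions" (false positives and false negatives). I'm sharing my code below. Thanks in advance.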
### Training XGBoost Model
import xgboost as xgb
from sklearn.metrics import roc_auc_score

xgb_train = xgb.DMatrix(X_train, label=y_train)
clf = xgb.XGBClassifier(
    max_depth=4,
    n_estimators=600,
    learning_rate=0.05,
    nthread=4,
    colsample_bytree=0.5,
    min_child_weight=5,
    seed=0)
## Cross Validation
cv = xgb.cv(clf.get_xgb_params(), xgb_train,
            num_boost_round=600,
            early_stopping_rounds=50,
            nfold=5,
            metrics=['auc'],
            seed=0)
# Fitting model with best parameters
clf.set_params(n_estimators=cv.shape[0])
clf.fit(X_train, y_train, eval_metric='auc')
# Calculate AUC in train and test.
auc_train = roc_auc_score(y_train, clf.predict_proba(X_train)[:,1])
y_proba_xgb = clf.predict_proba(X_test)[:,1]
auc_test = roc_auc_score(y_test, y_proba_xgb)
print(f"TRAIN: {auc_train}, TEST: {auc_test}")
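
For reference, here is a minimal sketch of the counts I am after, written in plain Python so it runs on its own. The helper name `count_outcomes` and the toy labels are made up for illustration; they are not part of my pipeline. As far as I understand, `sklearn.metrics.confusion_matrix(y_true, y_pred)` reports the same four numbers for 0/1 labels, laid out as `[[TN, FP], [FN, TP]]`.

```python
def count_outcomes(y_true, y_pred):
    """Count TP/TN/FP/FN for binary labels where 1 = churn and 0 = no churn."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            # Predicted churn: correct only if the customer really churned.
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            # Predicted no churn: correct only if the customer stayed.
            counts["tn" if actual == 0 else "fn"] += 1
    counts["correct"] = counts["tp"] + counts["tn"]  # "good predictions"
    counts["wrong"] = counts["fp"] + counts["fn"]    # "bad predictions"
    return counts


# Toy stand-ins for y_test and the predicted class labels (hypothetical data).
y_true_demo = [1, 0, 1, 1, 0, 0]
y_pred_demo = [1, 0, 0, 1, 1, 0]
summary = count_outcomes(y_true_demo, y_pred_demo)
print(summary)
```

With the model above, I assume this would be called as `count_outcomes(y_test, clf.predict(X_test))`, or with labels derived by thresholding `y_proba_xgb` (for example at 0.5) if a different cut-off is wanted.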