XGBoost: how to get pred_contribs using the sklearn API?

Time: 2018-04-06 16:43:32

Tags: python scikit-learn xgboost

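In Python, XGBoost lets you train and predict using either its Booster class or its sklearn API (http://xgboost.readthedocs.io/en/latest/python/python_api.html). I am using the sklearn API and would like to use XGBoost's pred_contribs feature. I expected this to work, but it does not: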

import xgboost as xgb
model = xgb.XGBClassifier().fit(X_train, y_train)
pred = model.predict_proba(X_test, pred_contribs=True)

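It looks like pred_contribs is only a parameter of the Booster class's predict function. How can I use this parameter through the sklearn API? Or is there a simple workaround to get the prediction contributions after training with the sklearn API?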

1 answer:

Answer 0: (score: 2)

You can use the get_booster() method of XGBClassifier, which returns the underlying Booster object once the XGBClassifier has been fitted on training data.

After that, simply call predict() on the Booster object with pred_contribs=True.

Example code:

from xgboost import XGBClassifier, DMatrix
from sklearn.datasets import load_iris

iris_data = load_iris()

# Take only the first 100 samples to make this a binary problem;
# otherwise it is multi-class and the shape of the pred_contribs output changes
X, y = iris_data.data[:100], iris_data.target[:100]

# This data has 4 features
print(X.shape)
# Output: (100, 4)


clf = XGBClassifier()
clf.fit(X, y)

# This is what you need
booster = clf.get_booster()


# Use a single sample for prediction (kept 2-D so it is one row); multiple rows work too
test_X = X[:1]

# Wrap the test X in a DMatrix, which Booster.predict() requires
predictions = booster.predict(DMatrix(test_X), pred_contribs=True)

print(predictions.shape)

# Output has 5 columns: one per feature, plus a final bias column
# Output: (1, 5)