Question

I have a pipeline model like the one shown below:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

pipeline = Pipeline(steps=[
    ('imputer', get_imputer(
        categorical_features=categorical_features,
        real_features=real_features,
        int_features=int_features,
    )),
    ('classifier', RandomForestClassifier(criterion='gini', class_weight='balanced')),
])
print(int_features)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)
y_pred = pipeline.fit(x_train, y_train).predict(x_test)

Here, get_imputer is a custom function that I created.

def get_imputer(categorical_features, real_features, int_features):
    # some function
    return result

Now I need to explain the predictions. For this, I used the SHAP KernelExplainer.

X = pipeline.named_steps['imputer'].fit_transform(X)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)


# use Kernel SHAP to explain test set predictions
import shap

shap.initjs()

explainer = shap.KernelExplainer(pipeline.named_steps['classifier'].predict_proba, x_train, link="logit")
shap_values = explainer.shap_values(x_test, nsamples=10)

# plot the SHAP values for the first class output of the first instance
shap.force_plot(explainer.expected_value[0], shap_values[0][0,:], x_test.iloc[0,:], link="logit")

Is this usage correct? Or is there a way to use the pipeline as a whole, as one can with GridSearchCV?
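One possible way to explain the pipeline as a whole, offered as a sketch and not a confirmed answer: pass KernelExplainer a function that calls the full pipeline's predict_proba, so the imputer runs inside the model being explained and the raw, untransformed features are what get attributed. KernelExplainer generally hands the model plain NumPy arrays, so the wrapper below rebuilds a DataFrame with the original column names. Everything here that is not in the question is an assumption: the helper names make_predict_fn and explain_with_pipeline are hypothetical, SimpleImputer stands in for get_imputer(...), and the toy data stands in for X and y. The sketch also uses the default identity link, because a random forest can return probabilities of exactly 0 or 1, which a logit link cannot represent.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def make_predict_fn(pipeline, columns):
    """Wrap the full pipeline so it accepts plain arrays (hypothetical helper)."""
    def predict_fn(data):
        # Rebuild a DataFrame so steps that select columns by name still work.
        frame = pd.DataFrame(data, columns=columns)
        return pipeline.predict_proba(frame)
    return predict_fn


def explain_with_pipeline(pipeline, x_background, x_explain, nsamples=100):
    """Run Kernel SHAP against the whole fitted pipeline (hypothetical helper)."""
    import shap  # imported lazily so the wrapper above works without shap installed
    predict_fn = make_predict_fn(pipeline, list(x_background.columns))
    explainer = shap.KernelExplainer(predict_fn, x_background)
    return explainer, explainer.shap_values(x_explain, nsamples=nsamples)


# Toy data standing in for the question's X and y (assumption).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 4)), columns=["a", "b", "c", "d"])
X.iloc[::10, 0] = np.nan  # missing values for the imputer to fill
y = (X["b"] + X["c"] > 0).astype(int)

pipeline = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="median")),  # stand-in for get_imputer(...)
    ("classifier", RandomForestClassifier(
        criterion="gini", class_weight="balanced", random_state=0)),
])
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)
pipeline.fit(x_train, y_train)

# The wrapped function takes raw arrays and returns class probabilities.
predict_fn = make_predict_fn(pipeline, list(x_train.columns))
proba = predict_fn(x_test.to_numpy()[:5])
```

With shap installed, a call such as explain_with_pipeline(pipeline, x_train.iloc[:50], x_test.iloc[:5]) would then run Kernel SHAP on raw rows; a small background sample keeps the runtime manageable. Two caveats: the layout of the returned SHAP values (a list per class or a single array) depends on the shap version, and if the raw data mixes strings and numbers, the rebuilt DataFrame may need its dtypes restored before it reaches the imputer.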

How to use a SHAP explainer with a Python Pipeline model

0 answers: