Getting individual predicted values in scikit-learn

Date: 2016-01-20 15:00:40

Tags: python scikit-learn

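I am trying to build a confusion matrix, so I need to retrieve the values predicted by my random forest. At the moment I call the cross-validation function, which only returns scores.

I know that sklearn.metrics has a function confusion_matrix(y_true, y_pred[, labels]), and I already have y_true (see y in prepareDataset(..)).

But I also need y_pred. Here is the relevant part of my code: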

    ...

    # Define classifier.
    rfc = RandomForestClassifier(n_estimators=self.n_estimators)
    # Explicitly define cv
    cv = StratifiedKFold(y, self.cv_folds)
    # Write out the actual splits
    ts = time.time()
    outlist = list(cv)

    with open(outPath, 'wb') as out:
        pickle.dump(outlist,out)

    # Perform crossvalidation.
    cv_scores = cross_validation.cross_val_score(rfc, X, y, cv=cv, n_jobs = 1)#, n_jobs=-1)
    # Calculate score and standard deviation.
    score = np.mean(cv_scores)
    std = np.std(cv_scores)

    ...

def prepareDataset(self, dataset):
    """ Splits the dataset in training- and target-dataset. """
    X = np.delete(dataset, dataset.shape[1] - 1, 1) # Training attributes.
    y = dataset[:,len(dataset[0]) - 1] # Training target.

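So how can I obtain the list of predictions to pass to confusion_matrix(...)?

Or is there another, simpler way? Thanks in advance.

1 answer:

Answer 0 (score: 1)

You can use cross_validation.cross_val_predict. It returns, for each sample, the prediction made while that sample was in the held-out fold. It would look like this: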

y_pred = cross_validation.cross_val_predict(rfc, X, y, cv=cv, n_jobs = 1)
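
For anyone on a newer scikit-learn: the sklearn.cross_validation module was removed in version 0.20, and the same functions now live in sklearn.model_selection. Below is a minimal sketch of the whole path from cross-validated predictions to a confusion matrix. The synthetic data from make_classification and the parameter values (n_estimators=50, 5 folds, the random seeds) are assumptions for illustration only; substitute the X and y from prepareDataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in for the question's X and y (assumption, for illustration).
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=5, random_state=0)

rfc = RandomForestClassifier(n_estimators=50, random_state=0)

# In sklearn.model_selection, StratifiedKFold takes the number of splits,
# not y; the labels are supplied later by cross_val_predict.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Every sample is predicted exactly once, by a model fitted on the
# folds that do not contain it.
y_pred = cross_val_predict(rfc, X, y, cv=cv, n_jobs=1)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y, y_pred)
print(cm)
```

Note that y_pred is stitched together from several models, one per fold, so the resulting matrix summarizes the cross-validation procedure as a whole rather than a single fitted forest.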