python - Handling the output of StratifiedKFold

Time: 2016-07-23 08:30:06

Tags: python machine-learning scikit-learn cross-validation

I have a data set with 20 rows and 60 columns: 20 examples, each with 60 parameters.

kfold = StratifiedKFold(y=encoded_Y, n_folds=10, shuffle=True, random_state=seed)

The output consists of two columns.

I would like to know what the second column means, and on what basis it picks two indices. Why not three?

Further, I would like to know how the cross-validation function accepts this object as the "cv" argument. "cv" is normally an integer.

results = cross_val_score(estimator, X, encoded_Y, cv=kfold)

Answer 0 (score: 0)

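As with all cross-validators in sklearn.cross_validation, this is an iterator over pairs of index lists. In each pair, the first item is the list of train indices and the second item is the list of test indices.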
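
To make the "iterator over pairs" idea concrete, here is a minimal pure-Python sketch. It is not scikit-learn's implementation: the `stratified_pairs` helper is hypothetical, it does no shuffling, and the labels are made up (20 samples in two balanced classes). It only illustrates why 20 samples split into 10 folds give two test indices per fold rather than three.

```python
from collections import defaultdict


def stratified_pairs(labels, n_folds):
    """Yield (train_indices, test_indices) pairs, one per fold.

    Simplified illustration only: each class's indices are dealt
    round-robin across the folds, with no shuffling.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)

    all_indices = range(len(labels))
    for test in folds:
        held_out = set(test)
        train = [i for i in all_indices if i not in held_out]
        yield train, sorted(test)


# Hypothetical labels: 20 samples, two balanced classes.
labels = [0] * 10 + [1] * 10
pairs = list(stratified_pairs(labels, n_folds=10))
for train, test in pairs:
    # Each fold holds out 20 / 10 = 2 samples and trains on the other 18.
    print(len(train), test)
```

Each sample lands in exactly one test list, so the size of the test list is simply the number of samples divided by the number of folds.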

In the example you bring, the first item is a pair in which everything except 1 and 17 are train indices, and 1 and 17 are the test indices.
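
As for the "cv" argument: besides an integer, cross_val_score also accepts an iterable of (train, test) index pairs, which is what kfold is here. The sketch below shows roughly how such pairs could be consumed. It is an illustration under assumptions, not scikit-learn's code: `score_over_folds`, the majority-label "estimator", the toy data and the hand-written splits are all hypothetical.

```python
from collections import Counter


def score_over_folds(fit, predict, X, y, cv):
    """Fit on each train split and return the accuracy on each test split.

    `cv` is any iterable of (train_indices, test_indices) pairs.
    """
    scores = []
    for train, test in cv:
        model = fit([X[i] for i in train], [y[i] for i in train])
        predictions = predict(model, [X[i] for i in test])
        correct = sum(p == y[i] for p, i in zip(predictions, test))
        scores.append(correct / len(test))
    return scores


# Hypothetical stand-in estimator: always predicts the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, X_test):
    return [model for _ in X_test]


# Toy data: 6 samples and three hand-written (train, test) pairs.
X = [[0.1], [0.2], [0.3], [0.9], [1.0], [1.1]]
y = [0, 0, 0, 0, 0, 1]
cv = [
    ([2, 3, 4, 5], [0, 1]),
    ([0, 1, 4, 5], [2, 3]),
    ([0, 1, 2, 3], [4, 5]),
]
scores = score_over_folds(fit_majority, predict_majority, X, y, cv)
print(scores)
```

Note that in later scikit-learn releases the sklearn.cross_validation module was replaced by sklearn.model_selection, where StratifiedKFold takes n_splits instead of y and n_folds, and the pairs come from its split(X, y) method.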