Python: sklearn KFold got multiple values for keyword argument 'shuffle'

Time: 2018-02-22 16:17:32

Tags: python scikit-learn cross-validation

I am trying to run cross-validation using KFold with classic sklearn:

def train_and_evaluate(clf, X_train, y_train):
    clf.fit(X_train, y_train)
    # create a k-fold cross validation iterator of k=5 folds
    cv = KFold(int(X_train.shape[0]), 4, shuffle = True)  ## Classic KFold
    scores = cross_val_score(clf, X_train, y_train, cv=cv)
    return (clf, scores) 

X_train, X_test, y_train, y_test =  train_test_split(X, Y, test_size=0.20, random_state=42)
scaler  = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test  = scaler.transform(X_test)

But I get the following error:

clf1, scores1 = train_and_evaluate(linear_model.SGDRegressor(), X_train, y_train)

TypeError: __init__() got multiple values for keyword argument 'shuffle'

2 Answers:

Answer 0: (Score: 1)

The signature of KFold looks like this:

sklearn.model_selection.KFold(n_splits=3, shuffle=False, random_state=None)

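So when you pass those two positional arguments, (int(X_train.shape[0]), 4), the first one is bound to n_splits and the 4 is bound to shuffle. You then pass shuffle again by name, which is why you get the multiple-values error.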
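The binding can be reproduced without scikit-learn. The sketch below uses a hypothetical stand-in function, kfold_like, that only copies the parameter order quoted above; it is not the library class, and n_samples is an assumed value standing in for X_train.shape[0].

```python
# Hypothetical stand-in that mirrors the parameter order quoted above;
# it is NOT the scikit-learn class, only an illustration of argument binding.
def kfold_like(n_splits=3, shuffle=False, random_state=None):
    return {"n_splits": n_splits, "shuffle": shuffle, "random_state": random_state}


n_samples = 100  # assumed sample count, standing in for X_train.shape[0]

try:
    # 1st positional -> n_splits, 2nd positional -> shuffle,
    # then shuffle is supplied a second time by keyword.
    kfold_like(n_samples, 4, shuffle=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Passing the arguments by keyword leaves no ambiguity.
params = kfold_like(n_splits=4, shuffle=True)
print(params)
```

The exact wording of the TypeError depends on the Python version, but it names shuffle as the argument that received two values.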

I'm not sure why you are passing those two positional arguments, but I think that if you want a 4-fold split you only need to pass 4.
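For what it's worth, the two-argument call resembles the older sklearn.cross_validation.KFold(n, n_folds, ...) form, which took the sample count first. With the model_selection class, a corrected version of the question's function might look like the sketch below. The synthetic X and Y, the random_state values, and the n_splits parameter on the function are my assumptions, since the original data is not shown.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler


def train_and_evaluate(clf, X_train, y_train, n_splits=4):
    # Pass only the number of folds; shuffle is given by keyword.
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=42)
    # cross_val_score fits a fresh clone of clf on each fold.
    scores = cross_val_score(clf, X_train, y_train, cv=cv)
    # Fit on the full training set so the returned estimator is usable.
    clf.fit(X_train, y_train)
    return clf, scores


# Synthetic regression data (an assumption; the question's X and Y are not shown).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.20, random_state=42)
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

clf1, scores1 = train_and_evaluate(SGDRegressor(random_state=0), X_train, y_train)
print(scores1)  # one score per fold
```

The number of samples is no longer passed to KFold at all; it is inferred when the data is split.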

Answer 1: (Score: 0)

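import numpy as np
from sklearn.model_selection import KFold

# Toy data: 100 samples
X = np.arange(100)

# 5 folds, shuffled; shuffle and random_state are passed by keyword
kf = KFold(n_splits=5, shuffle=True, random_state=None)

# split() yields (train_indices, test_indices) for each fold
for train_idx, test_idx in kf.split(X):
    print(train_idx, test_idx)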