Question

I don't understand when I have to call the fit method in scikit-learn.

On this page: http://machinelearningmastery.com/automate-machine-learning-workflows-pipelines-python-scikit-learn/ there is an example of a pipeline + StandardScaler. It does not call fit.

But on this other one: http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html there is also a StandardScaler, and there fit is called.

Here is my code, Pipeline + RobustScaler:

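import numpy as np
import scipy.io as sio
from sklearn import preprocessing
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

result_list = []

for name in ["AWA", "Rem", "S1", "S2", "SWS", "SX", "ALL"]:
    # Load the feature matrix and the labels for this condition
    data = sio.loadmat('/home/{}_E.mat'.format(name))
    x = data['x']
    y = np.ravel(data['y'])

    print(name, x.shape, y.shape)
    print("")

    # Create a pipeline: scale the features, then classify with an RBF SVM
    clf = make_pipeline(preprocessing.RobustScaler(), SVC(cache_size=1000, kernel='rbf'))

    ################### 10x20 SSS ##################################
    print("10x20")
    xSSSmean20 = []
    for i in range(10):
        # 20 stratified shuffle splits, 10% held out (scikit-learn >= 0.18 API)
        sss = StratifiedShuffleSplit(n_splits=20, test_size=0.1, random_state=i)
        scoresSSS = cross_val_score(clf, x, y, cv=sss)

        xSSSmean20.append(scoresSSS.mean())

    result_list.append(xSSSmean20)

    print("")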

Answer 1

To train your classifier, you have to fit it to your training dataset.
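As a minimal sketch of what fitting explicitly looks like, the snippet below builds the same kind of pipeline as in the question and calls fit by hand. The data comes from make_classification and is only a stand-in (an assumption, since the .mat files from the question are not available); the variable names are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption; not the .mat files from the question).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=0
)

clf = make_pipeline(RobustScaler(), SVC(kernel="rbf"))

# Predicting before fit fails: nothing has been learned yet.
try:
    clf.predict(X_test)
    raised_before_fit = False
except NotFittedError:
    raised_before_fit = True

# fit() learns the scaler statistics and the SVM from the training data only.
clf.fit(X_train, y_train)

# After fitting, the pipeline can scale and classify unseen samples.
accuracy = clf.score(X_test, y_test)
print(raised_before_fit, accuracy)
```

Calling fit on the pipeline fits every step in order, so the scaler never needs a separate fit call.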

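The first link does this too. Just because the snippet does not show it explicitly does not mean it is not happening:

The cross_val_score function takes model, which is the estimator that gets fitted to the data.

Have a look at the implementation of cross_val_score and try to understand how it works, rather than using it without understanding what it does.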
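A rough sketch of what cross_val_score does for each split is shown here. This is not the library's actual implementation, only a simplified hand-written loop under the assumption of the default scoring (the estimator's own score method) and synthetic stand-in data.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption; not the .mat files from the question).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

clf = make_pipeline(RobustScaler(), SVC(kernel="rbf"))
sss = StratifiedShuffleSplit(n_splits=5, test_size=0.1, random_state=0)

# Hand-written loop: one fresh copy of the pipeline is fitted per split.
manual_scores = []
for train_idx, test_idx in sss.split(X, y):
    fold_clf = clone(clf)                      # unfitted copy, same parameters
    fold_clf.fit(X[train_idx], y[train_idx])   # the fit call happens here
    manual_scores.append(fold_clf.score(X[test_idx], y[test_idx]))

# The helper runs the same clone / fit / score cycle internally.
auto_scores = cross_val_score(clf, X, y, cv=sss)

scores_match = np.allclose(manual_scores, auto_scores)
print(scores_match)
```

Because the helper works on clones, the clf object passed in stays unfitted afterwards. To predict on new data, call fit on it yourself.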

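Here is the documentation for the function, and here is the implementation on GitHub for reference.

Advice:

When you don't understand something, try digging into the code. You will learn a lot!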
