Scikit-learn train/test split problem with a neural network

Date: 2017-08-27 08:55:44

Tags: scikit-learn

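I am trying to train a neural network model on the iris data. When I split the data 50/50 between training and testing, the code runs fine, but when I use 60% for training and 40% for testing I get an error. Here is my code: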

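from sklearn import metrics
from sklearn.neural_network import MLPClassifier
from sklearn.cross_validation import train_test_split  # lives in sklearn.model_selection in newer releases

# X, y hold the iris features and labels (the loading code is not shown in the question)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=1)

clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(15,), random_state=1)

clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(metrics.accuracy_score(y_train, y_pred))  # this line raises the ValueError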

This is the error:

print(metrics.accuracy_score(y_train,y_pred))
Traceback (most recent call last):

  File "<ipython-input-51-aacc5d70d13b>", line 1, in <module>
    print(metrics.accuracy_score(y_train,y_pred))

  File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\metrics\classification.py", line 172, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)

  File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\metrics\classification.py", line 72, in _check_targets
    check_consistent_length(y_true, y_pred)

  File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\utils\validation.py", line 181, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])

ValueError: Found input variables with inconsistent numbers of samples: [90, 60]

1 answer:

Answer 0 (score: 0)

Since y_pred is predicted from X_test, you should compare y_pred with y_test:

print(metrics.accuracy_score(y_test,y_pred))
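For completeness, here is a minimal end-to-end sketch of the corrected flow. It makes a few assumptions that are not in the question: the data is loaded with load_iris, and train_test_split is imported from sklearn.model_selection, which is where it lives in newer scikit-learn releases. The classifier settings are copied from the question. The solver may emit a convergence warning with these settings; that does not affect the point being made.

```python
from sklearn import metrics
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Assumption: the question's X and y are the standard iris features and labels.
X, y = load_iris(return_X_y=True)

# 150 samples with test_size=0.4 gives 90 training rows and 60 test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1
)

clf = MLPClassifier(solver="lbfgs", alpha=1e-5,
                    hidden_layer_sizes=(15,), random_state=1)
clf.fit(X_train, y_train)

# Predictions are made for X_test, so they line up with y_test, not y_train.
y_pred = clf.predict(X_test)
acc = metrics.accuracy_score(y_test, y_pred)
print(acc)
```

The rule of thumb: whichever X you pass to predict, score the result against the y that came out of the same split.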

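accuracy_score counts how many entries of the two given arrays match, which is only possible if they have the same shape. If you compare the labels predicted for the test set against the true values of the training set, you will get fairly random behavior, because the two do not correspond to each other.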
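This also explains why the 50% split appeared to work: with 150 iris samples, y_train and y_pred both have 75 entries, so the length check passes even though the number that comes out is meaningless. The following is a simplified pure-Python sketch of the idea, not scikit-learn's actual implementation; the function name accuracy and the small label lists are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the two label sequences agree."""
    if len(y_true) != len(y_pred):
        # Mirrors the spirit of the length check that failed in the question.
        raise ValueError(
            "inconsistent numbers of samples: [%d, %d]"
            % (len(y_true), len(y_pred))
        )
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Hypothetical labels: y_pred was produced for the test rows.
y_test = [0, 1, 2, 1]
y_pred = [0, 1, 2, 2]
# Training labels that happen to have the same length (as in a 50/50 split).
y_train_same_len = [2, 0, 1, 0]

good = accuracy(y_test, y_pred)               # compares matching rows
wrong = accuracy(y_train_same_len, y_pred)    # runs, but compares unrelated rows

# With a 60/40 split the lengths differ, so the comparison fails outright.
try:
    accuracy([0] * 90, [0] * 60)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

In the second call no error is raised, yet the score says nothing about the model, which is the more dangerous of the two failure modes.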