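I am trying to do the first exercise in the scikit-learn tutorial, but even when I run their solution code (shown below), I immediately get the error in the traceback below. Does anyone know why this happens, and how I can fix it?
The predict method also fails when I use this dataset, yet for some reason the code at the very bottom of the question seems to work fine with the Iris dataset. Sorry if I am missing something really obvious; I am not really a programmer.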
Traceback (most recent call last):
  File "C:\Users\user2491873\Desktop\scikit_exercise.py", line 30, in <module>
    print(knn.fit(X_train, y_train).score(X_test, y_test))
  File "C:\Python33\lib\site-packages\sklearn\base.py", line 279, in score
    return accuracy_score(y, self.predict(X))
  File "C:\Python33\lib\site-packages\sklearn\neighbors\classification.py", line 131, in predict
    neigh_dist, neigh_ind = self.kneighbors(X)
  File "C:\Python33\lib\site-packages\sklearn\neighbors\base.py", line 254, in kneighbors
    warn_equidistant()
  File "C:\Python33\lib\site-packages\sklearn\neighbors\base.py", line 33, in warn_equidistant
    warnings.warn(msg, NeighborsWarning, stacklevel=3)
  File "C:\Python33\lib\idlelib\PyShell.py", line 59, in idle_showwarning
    file.write(warnings.formatwarning(message, category, filename,
AttributeError: 'NoneType' object has no attribute 'write'
Here is the code:
"""
================================
Digits Classification Exercise
================================
This exercise is used in the :ref:`clf_tut` part of the
:ref:`supervised_learning_tut` section of the
:ref:`stat_learn_tut_index`.
"""
from sklearn import datasets, neighbors, linear_model
digits = datasets.load_digits()
X_digits = digits.data
y_digits = digits.target
n_samples = len(X_digits)
split = int(.9 * n_samples)  # slice indices must be integers
X_train = X_digits[:split]
y_train = y_digits[:split]
X_test = X_digits[split:]
y_test = y_digits[split:]
knn = neighbors.KNeighborsClassifier()
logistic = linear_model.LogisticRegression()
print('KNN score: %f' % knn.fit(X_train, y_train).score(X_test, y_test))
print('LogisticRegression score: %f'
% logistic.fit(X_train, y_train).score(X_test, y_test))
Here is the code for the Iris dataset, which seems to work fine:
>>> import numpy as np
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> iris_X = iris.data
>>> iris_y = iris.target
>>> np.unique(iris_y)
array([0, 1, 2])
>>> # Split iris data in train and test data
>>> # A random permutation, to split the data randomly
>>> np.random.seed(0)
>>> indices = np.random.permutation(len(iris_X))
>>> iris_X_train = iris_X[indices[:-10]]
>>> iris_y_train = iris_y[indices[:-10]]
>>> iris_X_test = iris_X[indices[-10:]]
>>> iris_y_test = iris_y[indices[-10:]]
>>> # Create and fit a nearest-neighbor classifier
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier()
>>> knn.fit(iris_X_train, iris_y_train)
KNeighborsClassifier(algorithm='auto', leaf_size=30, n_neighbors=5, p=2,
warn_on_equidistant=True, weights='uniform')
>>> knn.predict(iris_X_test)
array([1, 2, 1, 0, 0, 0, 2, 1, 2, 0])
>>> iris_y_test
array([1, 1, 1, 0, 0, 0, 2, 1, 2, 0])
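Answer 0 (score: 6)
If you read the traceback message, it says that the variable file in the expression
file.write(warnings.formatwarning(message, category, filename, ...)
is set to None
instead of the expected channel (for example the program's standard output, or a buffer in the user interface).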
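The failure mode described above can be reproduced without scikit-learn or IDLE. The following is only a minimal sketch: show_warning_to is a hypothetical stand-in for an IDE's warning hook, not the actual IDLE code, and the message, file name, and line number are made up for illustration.

```python
import warnings


def show_warning_to(stream, message, category, filename, lineno):
    """Hypothetical warning hook: format a warning and write it to a stream."""
    stream.write(warnings.formatwarning(message, category, filename, lineno))


try:
    # Passing None mimics a hook whose output stream was never set up.
    show_warning_to(None, "equidistant neighbors", UserWarning, "example.py", 1)
except AttributeError as exc:
    error_text = str(exc)
    print(error_text)  # 'NoneType' object has no attribute 'write'
```

The point is that the exception is raised while a warning is being displayed, so the code that triggered the warning (here, the k-nearest-neighbors prediction) is not itself at fault.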
This suggests that it is probably a bug in IDLE. If you google the error message, you get:
http://bugs.python.org/issue18030
which in turn points to:
http://bugs.python.org/issue13582
So this bug is really unrelated to scikit-learn. I would advise you to:
either launch IDLE from a cmd console by typing
python -m idlelib.idle
or use another Python IDE / environment.
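If neither option is practical, one further possibility (an assumption on my part, not something taken from the bug reports above) is to silence warnings in the script, so that the display hook is never called. The sketch below uses a hypothetical fragile_hook to stand in for a broken warning display and shows that an "ignore" filter prevents it from running. Note that this hides the warning rather than fixing the underlying problem.

```python
import warnings

calls = []


def fragile_hook(message, category, filename, lineno, file=None, line=None):
    # Hypothetical hook standing in for a warning display that crashes.
    calls.append(str(message))
    raise AttributeError("'NoneType' object has no attribute 'write'")


# catch_warnings restores the original hook and filters on exit.
with warnings.catch_warnings():
    warnings.showwarning = fragile_hook
    warnings.simplefilter("ignore")  # ignored warnings never reach the hook
    warnings.warn("equidistant neighbors", UserWarning)

hook_calls = len(calls)
print(hook_calls)  # 0
```

In the exercise script, this would amount to calling warnings.simplefilter("ignore") before fitting and scoring the classifier.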