Python scikit-learn logistic regression error

Date: 2018-03-09 11:57:17

Tags: python scikit-learn logistic-regression

I am trying to plot a logistic regression from the following data:

X = np.array([0,1,2,3,4,5,6,7,8,9,10,11])
y = np.array([0,0,0,0,1,0,1,0,1,1,1,1])

However, when I try:

import numpy as np
import matplotlib.pyplot as plt

from sklearn import linear_model

X = np.array([0,1,2,3,4,5,6,7,8,9,10,11])
y = np.array([0,0,0,0,1,0,1,0,1,1,1,1])

clf = linear_model.LogisticRegression(C=1e5)
clf.fit(X, y)

I get the following error:

ValueError: Found input variables with inconsistent numbers of samples: [1, 12]

I'm a bit confused about why it thinks either X or y has only one sample.

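1 answer:

Answer 0 (score: 2)

Modern versions of sklearn expect X to be a 2D array of shape (n_samples, n_features), so try reshaping it, as the error message in newer versions suggests: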

In [7]: clf.fit(X.reshape(-1,1), y)
Out[7]:
LogisticRegression(C=100000.0, class_weight=None, dual=False,
          fit_intercept=True, intercept_scaling=1, max_iter=100,
          multi_class='ovr', n_jobs=1, penalty='l2', random_state=None,
          solver='liblinear', tol=0.0001, verbose=0, warm_start=False)

By the way, sklearn 0.19.1 gives me a much clearer error message:

In [10]: sklearn.__version__
Out[10]: '0.19.1'

In [11]: clf.fit(X, y)
...
skipped
...
ValueError: Expected 2D array, got 1D array instead:
array=[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
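
To see what the two reshape calls in that message do, here is a minimal sketch using only NumPy. The comparison with np.atleast_2d is my assumption about where the [1, 12] in the older error came from: promoting a 1D array to 2D without naming an axis yields a single row, i.e. one sample with 12 features, which then doesn't match the 12 labels in y.

```python
import numpy as np

X = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])

# 1D array: no explicit sample/feature axes
print(X.shape)

# 12 samples with 1 feature each: the layout fit() needs for this data
X_col = X.reshape(-1, 1)
print(X_col.shape)

# 1 sample with 12 features: valid 2D, but not what we mean here
X_row = X.reshape(1, -1)
print(X_row.shape)

# Generic 1D -> 2D promotion also produces a single row
print(np.atleast_2d(X).shape)
```

So X.reshape(-1, 1) is the right call whenever you have one feature measured on many samples.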

Update: full code:

In [41]: %paste
import numpy as np
import matplotlib.pyplot as plt

from sklearn import linear_model
import sklearn

X = np.array([0,1,2,3,4,5,6,7,8,9,10,11])
y = np.array([0,0,0,0,1,0,1,0,1,1,1,1])

print('SkLearn version: {}'.format(sklearn.__version__))

clf = linear_model.LogisticRegression(C=1e5)
clf.fit(X.reshape(-1,1), y)

## -- End pasted text --
SkLearn version: 0.19.1
Out[41]:
LogisticRegression(C=100000.0, class_weight=None, dual=False,
          fit_intercept=True, intercept_scaling=1, max_iter=100,
          multi_class='ovr', n_jobs=1, penalty='l2', random_state=None,
          solver='liblinear', tol=0.0001, verbose=0, warm_start=False)
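
Since the original goal was to plot the fitted curve, here is a rough sketch of how the values for that curve could be computed once the fit works. X_grid and manual are names I made up for illustration, and the hand-written sigmoid check assumes a binary problem like this one.

```python
import numpy as np
from sklearn import linear_model

X = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
y = np.array([0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1])

clf = linear_model.LogisticRegression(C=1e5)
clf.fit(X.reshape(-1, 1), y)

# Dense grid of x values, again shaped as a 2D column (200 samples, 1 feature)
X_grid = np.linspace(0, 11, 200).reshape(-1, 1)

# predict_proba returns one column per class; column 1 is P(y == 1)
proba = clf.predict_proba(X_grid)[:, 1]

# The same curve by hand: sigmoid of (coef * x + intercept)
manual = 1.0 / (1.0 + np.exp(-(X_grid @ clf.coef_.T + clf.intercept_))).ravel()
print(np.allclose(proba, manual))
```

Passing X_grid.ravel() and proba to plt.plot, plus plt.scatter(X, y) for the raw points, should then draw the S-shaped curve over the data.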