I am trying to train a logistic regression model in scikit-learn with the following code:
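import numpy as np
from sklearn import linear_model

# One column of predicted probabilities per label
preds = np.zeros((len(test), len(label_cols)))
for i, j in enumerate(label_cols):
    print('fit', j)
    model = linear_model.LogisticRegression(C=4, dual=True)
    model.fit(train_x[5:10], list(train[j])[5:10])
    # Note: this line calls `m`, not the `model` fitted above (same as in the traceback below)
    preds[:,i] = m.predict_proba(test_x)[:,1]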
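I have a multi-label classification problem; train[j] holds the labels for one class at a time, so I train a separate LR model for each label.
However, I get the following error: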
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-66-4c73d23d3af1> in <module>()
3 model = linear_model.LogisticRegression(C=4, dual=True)
4 model.fit(train_x[5:10], list(train[j])[5:10])
----> 5 preds[:,i] = m.predict_proba(test_x)[:,1]
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py in predict_proba(self, X)
1334 calculate_ovr = self.coef_.shape[0] == 1 or self.multi_class == "ovr"
1335 if calculate_ovr:
-> 1336 return super(LogisticRegression, self)._predict_proba_lr(X)
1337 else:
1338 return softmax(self.decision_function(X), copy=False)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\base.py in _predict_proba_lr(self, X)
336 multiclass is handled by normalizing that over all classes.
337 """
--> 338 prob = self.decision_function(X)
339 prob *= -1
340 np.exp(prob, prob)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\base.py in decision_function(self, X)
298 "yet" % {'name': type(self).__name__})
299
--> 300 X = check_array(X, accept_sparse='csr')
301
302 n_features = self.coef_.shape[1]
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
431 force_all_finite)
432 else:
--> 433 array = np.array(array, dtype=dtype, order=order, copy=copy)
434
435 if ensure_2d:
ValueError: setting an array element with a sequence.
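For reference, the last frame of the traceback is NumPy itself refusing to build a numeric array. A minimal sketch that reproduces the same exception type outside scikit-learn, assuming hypothetical ragged input (`ragged_rows` is made up for illustration and is not the data from this question):

```python
import numpy as np

# Hypothetical ragged input: the second row is shorter than the first,
# so the rows cannot be stacked into a rectangular float array.
ragged_rows = [[0.1, 0.2, 0.3], [0.4, 0.5]]

error_message = None
try:
    np.array(ragged_rows, dtype=np.float64)
except ValueError as exc:
    # Recent NumPy versions append extra detail after this sentence.
    error_message = str(exc)
    print(error_message)
```

Whether ragged rows are the actual cause here is an assumption; nested sequences or non-numeric entries in a row can raise the same error.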
I have read other answers to the same problem, but I could not debug my case. Every instance in X has 300 dimensions:
len(list(filter(lambda x:len(x)!=300, train_x)))
>> True
and the length of train_x (features) equals the length of train (labels):
len(train_x) == len(train)
>>True
I also tested whether all values in train_x are floats, which is also True:
len(list(filter(lambda x: all(isinstance(n, float) for n in x), train_x))) == len(train_x)
>> True
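The checks above cover only train_x, while the traceback fails inside predict_proba(test_x), so the same checks may be worth running on test_x. A minimal sketch of such a check, assuming each row should be a flat sequence of 300 floats; `find_bad_rows` and `sample_test_x` are hypothetical names introduced here for illustration:

```python
import numpy as np

def find_bad_rows(rows, expected_len):
    """Return indices of rows that are not flat float vectors of expected_len."""
    bad = []
    for idx, row in enumerate(rows):
        try:
            arr = np.asarray(row, dtype=np.float64)
        except (ValueError, TypeError):
            # Ragged or non-numeric content inside the row
            bad.append(idx)
            continue
        if arr.shape != (expected_len,):
            # Wrong length, a nested row, or a bare scalar such as NaN
            bad.append(idx)
    return bad

# Hypothetical stand-in for test_x: row 1 is too short, row 2 is a bare NaN
sample_test_x = [[0.0] * 300, [0.0] * 299, float("nan"), [0.0] * 300]
print(find_bad_rows(sample_test_x, 300))  # [1, 2]
```

If calling `find_bad_rows(test_x, 300)` on the real data returns a non-empty list, those rows would be the first place to look.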