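I have two numpy arrays, x and y, obtained from an SFrame. x has 6 feature columns and y (the target variable) has a single column.
x = np.array([[0, 0, 0, 24, 0, 34], [0, 0, 0, 22, 0, 34], ...])
y = np.array([[0], [0], [0], [1], [1], ...])
I am applying a naive Bayes classifier with scikit-learn. When I try to fit x and y with the naive Bayes classifier, I get the following warning and error: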
/home/.../local/lib/python2.7/site-packages/sklearn/utils/validation.py:526: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
Traceback (most recent call last):
  File "main_naive.py", line 10, in <module>
    main()
  File "main_naive.py", line 7, in main
    naive_bayes.predict()
  File "/home/.../naive_bayes_model.py", line 184, in predict
    self.naive_bayes.fit(x, y)
  File "/home/.../local/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 566, in fit
    Y = labelbin.fit_transform(y)
  File "/home/.../local/lib/python2.7/site-packages/sklearn/base.py", line 494, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "/home/.../local/lib/python2.7/site-packages/sklearn/preprocessing/label.py", line 304, in fit
    self.classes_ = unique_labels(y)
  File "/home/.../local/lib/python2.7/site-packages/sklearn/utils/multiclass.py", line 98, in unique_labels
    raise ValueError("Unknown label type: %s" % repr(ys))
ValueError: Unknown label type: (array([0, 0, 0, ..., 0, 0, 0], dtype=object),)
Here is my code:
import numpy as np
from sklearn.naive_bayes import BernoulliNB
naive_bayes = BernoulliNB(alpha=1e-2)
#x = self.training1[self.feature_columns].to_numpy()
#x = x.reshape(-len(self.feature_columns), len(self.feature_columns))
#y = self.training1[[target_column]].to_numpy()
#y = y.reshape(-1L,1L)
x = np.array([[0, 0, 0, 24, 0, 34], [0, 0, 0, 22, 0, 34], ...])
y = np.array([[0], [0], [0], [1], [1], ...])
naive_bayes.fit(x, y)
Where am I going wrong?
Answer 0 (score: 0)
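I figured out the problem. y contained None values, which is consistent with the dtype=object shown in the error message, so scikit-learn could not determine the label type. I simply removed the None values from y.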
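For anyone hitting the same error, here is a minimal sketch of that fix. The toy data is hypothetical (the real arrays in the question are elided), and it assumes the corresponding rows of x should be dropped along with the None labels so that both arrays stay aligned. It also flattens y with ravel(), as the DataConversionWarning suggests.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Hypothetical toy data: one label is None, which forces y to dtype=object.
x = np.array([[0, 0, 0, 24, 0, 34],
              [0, 0, 0, 22, 0, 34],
              [1, 0, 0, 20, 0, 30],
              [0, 1, 0, 18, 1, 30],
              [1, 1, 0, 25, 1, 12]])
y = np.array([[0], [0], [None], [1], [1]])

# Flatten the column vector to shape (n_samples,).
y_flat = y.ravel()

# Boolean mask marking the rows whose label is present.
mask = np.array([label is not None for label in y_flat])

# Drop the unlabeled rows from both arrays and cast the labels to int.
x_clean = x[mask]
y_clean = y_flat[mask].astype(int)

naive_bayes = BernoulliNB(alpha=1e-2)
naive_bayes.fit(x_clean, y_clean)
predictions = naive_bayes.predict(x_clean)
```

Casting with astype(int) after filtering matters here: even with the None values removed, the array would otherwise keep dtype=object.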