Cannot run sklearn.naive_bayes GaussianNB on the California housing dataset

Time: 2019-11-15 04:39:58

Tags: python-3.x machine-learning jupyter-notebook

When I try to fit the dataset I get the error Unknown label type: (array([0.14999, 0.175 , 0.225 , ..., 4.991 , 5. , 5.00001]),)

from sklearn.datasets import fetch_california_housing
from sklearn.datasets import load_iris

cali = fetch_california_housing()

iris = load_iris()   
from sklearn.naive_bayes import GaussianNB


gnb = GaussianNB() # probabilistic
y_pred_cali = gnb.fit(cali.data, cali.target).predict(cali.data)

Error:

ValueError                                Traceback (most recent call last)
<ipython-input-23-71ed3304ef0f> in <module>
     14 
     15 gnb = GaussianNB() # probabilistic
---> 16 y_pred_cali = gnb.fit(cali[0], cali[1]).predict(cali[0])
     17 

~\Anaconda3\lib\site-packages\sklearn\naive_bayes.py in fit(self, X, y, sample_weight)
    189         X, y = check_X_y(X, y)
    190         return self._partial_fit(X, y, np.unique(y), _refit=True,
--> 191                                  sample_weight=sample_weight)
    192 
    193     @staticmethod

~\Anaconda3\lib\site-packages\sklearn\naive_bayes.py in _partial_fit(self, X, y, classes, _refit, sample_weight)
    351             self.classes_ = None
    352 
--> 353         if _check_partial_fit_first_call(self, classes):
    354             # This is the first call to partial_fit:
    355             # initialize various cumulative counters

~\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py in _check_partial_fit_first_call(clf, classes)
    318         else:
    319             # This is the first call to partial_fit
--> 320             clf.classes_ = unique_labels(classes)
    321             return True
    322 

~\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py in unique_labels(*ys)
     92     _unique_labels = _FN_UNIQUE_LABELS.get(label_type, None)
     93     if not _unique_labels:
---> 94         raise ValueError("Unknown label type: %s" % repr(ys))
     95 
     96     ys_labels = set(chain.from_iterable(_unique_labels(y) for y in ys))

ValueError: Unknown label type: (array([0.14999, 0.175  , 0.225  , ..., 4.991  , 5.     , 5.00001]),)

1 answer:

Answer 0 (score: 0)

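This dataset has a continuous target variable (cali.target holds real-valued numbers such as 0.14999 and 4.991, as the error message shows).

GaussianNB is a classification method, not a regression method. The target y must consist of discrete class labels, not continuous values, which is why fit raises "Unknown label type".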
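
As a rough illustration of the point above, here is a minimal sketch of two possible ways forward: bin the continuous target into classes so GaussianNB can be used, or switch to a regressor. This is not from the original post. It uses small synthetic data as a stand-in for cali.data and cali.target so that it runs without downloading anything, and the quartile binning and the choice of LinearRegression are only assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-in for cali.data / cali.target (assumption: avoids the download).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# scikit-learn sees real-valued y as a regression target.
target_kind = type_of_target(y)

# Fitting a classifier on it reproduces the error from the question.
try:
    GaussianNB().fit(X, y)
    raised = False
except ValueError:
    raised = True

# Option A (assumed binning scheme): cut y into quartile classes 0..3,
# then GaussianNB can be fitted on the discrete labels.
edges = np.quantile(y, [0.25, 0.5, 0.75])
y_binned = np.digitize(y, edges)
gnb = GaussianNB().fit(X, y_binned)
y_pred_class = gnb.predict(X)

# Option B: keep y continuous and use a regressor instead
# (LinearRegression is just one possible choice).
reg = LinearRegression().fit(X, y)
y_pred_reg = reg.predict(X)

print(target_kind, type_of_target(y_binned), raised)
```

Which option makes sense depends on the question being asked: binning throws away information about the exact value, so if the goal is to predict the house value itself, a regression model is probably the more natural fit.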