StackingClassifier raises "ValueError: bad input shape"

Asked: 2020-07-22 18:10:36

Tags: python numpy scikit-learn classification

I am running a classifier with scikit-learn's StackingClassifier and I am getting an error that I cannot resolve. Here is the code:

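import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, RidgeCV
from sklearn.metrics import homogeneity_score
from sklearn.model_selection import train_test_split

# Column 0 holds the text; columns 1-9 hold the nine target columns.
testset = pd.read_csv('testset_200.csv').fillna(0.0)
X = testset.iloc[:, 0]
y = testset.iloc[:, 1:10]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, shuffle=True)

# Fit TF-IDF on the training text only, then apply it to the test text.
vec = TfidfVectorizer(stop_words='english')
X_train = vec.fit_transform(X_train)
X_test = vec.transform(X_test)

estimators = [('rcv', RidgeCV()),
              ('rfc', RandomForestClassifier(n_estimators=10))]

classifier = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression())
y_pred = classifier.fit(X_train, y_train)  # the ValueError is raised here
y_pred = y_pred.predict(X_test)

# Collapse the nine columns to a single class index per row.
y_pred = np.argmax(y_pred, axis=1)
y_test = np.argmax(y_test.values, axis=1)

# homogeneity_score expects (labels_true, labels_pred).
homogeneity_score(y_test, y_pred)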

This is the error message I get:

ValueError                                Traceback (most recent call last)
<ipython-input-433-4bc84f21b661> in <module>
      3 
      4 classifier = StackingClassifier(estimators = estimators, final_estimator = LogisticRegression())
----> 5 y_pred = classifier.fit(X_train, y_train)
      6 y_pred = y_pred.predict(X_test)
      7 

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/sklearn/ensemble/_stacking.py in fit(self, X, y, sample_weight)
    409         """
    410         check_classification_targets(y)
--> 411         self._le = LabelEncoder().fit(y)
    412         self.classes_ = self._le.classes_
    413         return super().fit(X, self._le.transform(y), sample_weight)

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/sklearn/preprocessing/_label.py in fit(self, y)
    233         self : returns an instance of self.
    234         """
--> 235         y = column_or_1d(y, warn=True)
    236         self.classes_ = _encode(y)
    237         return self

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/sklearn/utils/validation.py in column_or_1d(y, warn)
    795         return np.ravel(y)
    796 
--> 797     raise ValueError("bad input shape {0}".format(shape))
    798 
    799 

ValueError: bad input shape (139, 9)
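
The traceback ends inside LabelEncoder.fit, which only accepts a 1-D array of labels (or a single column), so a (139, 9) target is rejected before any base estimator is trained. The following is a minimal sketch that reproduces just that step on made-up data. The one-hot layout of `y_2d` is an assumption, suggested by the np.argmax calls in the question, and the exact wording of the error message differs between scikit-learn versions.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for y_train: 139 rows, 9 one-hot target columns.
y_2d = np.zeros((139, 9), dtype=int)
y_2d[np.arange(139), np.arange(139) % 9] = 1

# A 2-D target with more than one column is rejected.
try:
    LabelEncoder().fit(y_2d)
    raised = False
except ValueError as exc:
    raised = True
    print("LabelEncoder rejected the 2-D target:", exc)

# Collapsing the columns to one class index per row gives a valid 1-D target.
y_1d = np.argmax(y_2d, axis=1)
encoder = LabelEncoder().fit(y_1d)
print(encoder.classes_)
```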

Any help would be greatly appreciated. Note, as a possible hint: when I set the `classifier` variable to just the RidgeCV() linear model, the code runs correctly.
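
RidgeCV on its own is a regressor that accepts multi-output targets, which would explain why it fits the nine-column `y` without complaint. One possible workaround, sketched below, is to collapse the target to a single class label per row before fitting and to use a classifier (RidgeClassifierCV here) in place of the RidgeCV regressor as a base estimator. This assumes the nine columns are one-hot indicators of a single class, which the question does not confirm; if a row can carry several labels at once, this approach does not apply. The data is synthetic, with three classes and dense features standing in for the nine columns and the TF-IDF matrix.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifierCV
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the features and the one-hot target columns.
X, labels = make_classification(n_samples=200, n_features=20, n_informative=10,
                                n_classes=3, random_state=0)
y_onehot = np.eye(3, dtype=int)[labels]  # shape (200, 3)

# Collapse the one-hot columns to a 1-D label vector BEFORE fitting.
y = np.argmax(y_onehot, axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=True, stratify=y, random_state=0)

# Both base estimators are classifiers, so each can be stacked on 1-D labels.
estimators = [('rcv', RidgeClassifierCV()),
              ('rfc', RandomForestClassifier(n_estimators=10, random_state=0))]

classifier = StackingClassifier(estimators=estimators,
                                final_estimator=LogisticRegression(max_iter=1000))
classifier.fit(X_train, y_train)

# predict already returns 1-D class labels, so no argmax is needed afterwards.
y_pred = classifier.predict(X_test)
print(y_pred.shape)
```

With this layout the later np.argmax(y_pred, axis=1) call from the question would no longer be needed, since both `y_pred` and `y_test` are already 1-D.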

0 answers:

No answers yet