AttributeError: probability estimates are not available for loss='hinge'

Time: 2016-08-29 07:00:34

Tags: python scikit-learn

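import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

text_clf = Pipeline([('vect', CountVectorizer(decode_error='ignore')),
                     ('tfidf', TfidfTransformer()),
                     ('clf', SGDClassifier(loss='hinge', penalty='elasticnet', alpha=1e-3,
                                           max_iter=10, random_state=40))])  # n_iter before scikit-learn 0.19

text_clf = text_clf.fit(trainDocs + valDocs, np.array(trainLabels + valLabels))
predicted = text_clf.predict_proba(testDocs)  # raises the AttributeError in the title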

How can I get the predicted probability for each test sample? Thanks!

2 answers:

Answer 0: (score: 2)

SGDClassifier(loss='hinge') does not provide probability estimates by default.
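One way to see this before fitting anything is to check whether the estimator exposes predict_proba at all. This is only a minimal sketch: the variable names are illustrative, and the 'modified_huber' comparison is added here for contrast, it is not part of the original question.

```python
from sklearn.linear_model import SGDClassifier

# Hinge loss (linear SVM): scikit-learn hides predict_proba for this loss.
hinge_clf = SGDClassifier(loss='hinge')

# 'modified_huber' is one loss for which SGDClassifier does expose predict_proba
# (shown only for contrast; an assumption about what the reader may want to compare).
huber_clf = SGDClassifier(loss='modified_huber')

has_proba_hinge = hasattr(hinge_clf, 'predict_proba')
has_proba_huber = hasattr(huber_clf, 'predict_proba')
print(has_proba_hinge, has_proba_huber)
```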

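To get probabilities while keeping the hinge loss, pass SGDClassifier(loss='hinge') to CalibratedClassifierCV(), which fits a calibration step on top of the classifier's decision scores and exposes the result through predict_proba.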

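from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import SGDClassifier

# best_alpha, X_tr, X_te and y_train are defined elsewhere (not shown in this answer)
lr = SGDClassifier(loss='hinge', alpha=best_alpha, class_weight='balanced')
clf = lr.fit(X_tr, y_train)
# cv='prefit': clf is already fitted; held-out data is preferable to X_tr for calibration
calibrator = CalibratedClassifierCV(clf, cv='prefit')
model = calibrator.fit(X_tr, y_train)

y_train_pred = model.predict_proba(X_tr)
y_test_pred = model.predict_proba(X_te)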
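For the text pipeline in the question, the same idea might look like the sketch below. It is an illustration under assumptions, not the answerer's code: the toy corpus, the labels, and the names base, calibrated, test_docs and proba are all made up, TfidfVectorizer stands in for CountVectorizer plus TfidfTransformer, and cv=3 lets CalibratedClassifierCV fit and calibrate on separate folds instead of using a prefit model.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus, for illustration only.
docs = ["good movie", "great film", "nice plot",
        "loved it", "fine acting", "superb cast",
        "bad movie", "awful film", "poor plot",
        "hated it", "weak acting", "terrible cast"]
labels = [1] * 6 + [0] * 6

# Unfitted hinge-loss classifier; the calibrator fits it internally.
base = SGDClassifier(loss='hinge', penalty='elasticnet', alpha=1e-3, random_state=40)

# cv=3: each fold trains a copy of `base` and calibrates it on the held-out part.
calibrated = CalibratedClassifierCV(base, cv=3)

text_clf = Pipeline([('tfidf', TfidfVectorizer()),
                     ('clf', calibrated)])
text_clf.fit(docs, labels)

test_docs = ["good film", "bad plot"]
proba = text_clf.predict_proba(test_docs)  # one row per document, one column per class
print(proba)
```

The columns of proba follow the order of text_clf.classes_. With a corpus this small the calibrated values are not meaningful; the point is only that predict_proba becomes available.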

Answer 1: (score: 0)

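You can use decision_function instead of predict_proba to get a prediction score for each sample. Note that these scores are signed distances to the decision boundary, not probabilities.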
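A minimal sketch of that approach on a binary problem follows. The toy corpus and the names scores and pred_from_scores are assumptions made for illustration; only decision_function, predict and classes_ come from scikit-learn.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus, for illustration only.
docs = ["good movie", "great film", "nice plot", "loved it",
        "bad movie", "awful film", "poor plot", "hated it"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

text_clf = Pipeline([('tfidf', TfidfVectorizer()),
                     ('clf', SGDClassifier(loss='hinge', random_state=40))])
text_clf.fit(docs, labels)

test_docs = ["good film", "bad plot"]

# Binary case: one signed score per sample; a positive score means classes_[1].
scores = text_clf.decision_function(test_docs)

# Thresholding the scores at zero reproduces what predict() returns.
pred_from_scores = text_clf.classes_[(scores > 0).astype(int)]
print(scores, pred_from_scores)
```

The scores are useful for ranking samples or choosing a custom threshold, but they are unbounded, so they cannot be read as probabilities without a calibration step.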