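I'm applying predict_proba to a classification problem. I have some experience building classification models in R, but this is my first time using Python's sklearn.
So the question is: after fitting a model in sklearn, I can't find a way to access the probabilities. Is it possible? There is a method predict_proba(), but... as the name suggests, it is a prediction. Here is my code: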
import pandas as pd
from sklearn.svm import SVC
from sklearn.svm import LinearSVC
import pickle
from nltk.tokenize import sent_tokenize
from Model import SkripsiPipeline

def konten(kata, model):
    item = []
    loaded_model = pickle.load(open(model, 'rb'))
    for v in kata.itertuples(index=False):
        sentiment = []
        variabel1 = v[0]
        variabel2 = v[1]
        kalimat = variabel1 + variabel2
        hasil_tokenize = sent_tokenize(kalimat)
        preds = loaded_model.predict(hasil_tokenize)
        if preds == 1:
            proba = loaded_model.predict_proba(hasil_tokenize)
            proba = proba.reshape(-1, 1).tolist()
            sentiment.append('Positif')
            sentiment.append(proba[0])
        elif preds == 0:
            proba = loaded_model.predict_proba(hasil_tokenize)
            proba = proba.reshape(-1, 1).tolist()
            sentiment.append('Netral')
            sentiment.append(proba[1])
        elif preds == -1:
            proba = loaded_model.predict_proba(hasil_tokenize)
            proba = proba.reshape(-1, 1).tolist()
            sentiment.append('Negatif')
            sentiment.append(proba[2])
        item.append(sentiment)
    return item
But I get this error:
AttributeError: 'SkripsiPipeline' object has no attribute 'predict_proba'
Here is the SkripsiPipeline code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.model_selection import KFold
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
import pickle

class SkripsiPipeline():
    def __init__(self, predictor):
        self.predictor = predictor

    def fit(self, X, y):
        vectorizer = CountVectorizer()
        tfidf_transformer = TfidfTransformer()
        svm_predictor = self.predictor
        X = vectorizer.fit_transform(X)
        X = tfidf_transformer.fit_transform(X)
        svm_predictor.fit(X, y)
        self.vectorizer = vectorizer
        self.tfidf_transformer = tfidf_transformer
        self.svm_predictor = svm_predictor

    def predict(self, X):
        X = self.vectorizer.transform(X)
        X = self.tfidf_transformer.transform(X)
        prediction = self.svm_predictor.predict(X)
        return prediction
I'm new to Python's sklearn package. Can anyone tell me what is wrong with my Python code? I've searched on Google but couldn't make sense of what I found.
Answer 0 (score: 0)
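You are calling predict_proba, a method that does not exist in the SkripsiPipeline class. You should implement a method similar to predict, but one that calls the predict_proba method of svm_predictor instead of its predict method.
It should look something like this: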
    def predict_proba(self, X):
        X = self.vectorizer.transform(X)
        X = self.tfidf_transformer.transform(X)
        proba = self.svm_predictor.predict_proba(X)
        return proba
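Two more things may trip you up once the method exists. As far as I know, sklearn's SVC only exposes predict_proba when it was constructed with probability=True, and LinearSVC has no predict_proba at all. Also, the columns of the returned array follow the order of svm_predictor.classes_ (sorted labels, so -1, 0, 1 here), not the order of the if/elif branches in konten. Here is a minimal, self-contained sketch of the whole thing. The toy sentences, the labels, and the linear kernel are my own assumptions for illustration, not taken from your data or your pickled model:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.svm import SVC


class SkripsiPipeline:
    """Stand-in for the class in the question, with predict_proba added."""

    def __init__(self, predictor):
        self.predictor = predictor

    def fit(self, X, y):
        self.vectorizer = CountVectorizer()
        self.tfidf_transformer = TfidfTransformer()
        X = self.vectorizer.fit_transform(X)
        X = self.tfidf_transformer.fit_transform(X)
        self.svm_predictor = self.predictor.fit(X, y)
        return self

    def _features(self, X):
        # Apply the already-fitted text transformers to new documents.
        return self.tfidf_transformer.transform(self.vectorizer.transform(X))

    def predict(self, X):
        return self.svm_predictor.predict(self._features(X))

    def predict_proba(self, X):
        return self.svm_predictor.predict_proba(self._features(X))


# Hypothetical toy data: 1 = positive, 0 = neutral, -1 = negative.
positive = ["good product", "great service", "love it", "very nice", "really good"]
neutral = ["it arrived today", "the box is blue", "delivered on monday",
           "it has two buttons", "the manual is included"]
negative = ["bad product", "terrible service", "hate it", "very poor", "really bad"]
texts = (positive + neutral + negative) * 2
labels = ([1] * 5 + [0] * 5 + [-1] * 5) * 2

# probability=True is what makes predict_proba available on SVC.
model = SkripsiPipeline(SVC(kernel="linear", probability=True, random_state=0))
model.fit(texts, labels)

proba = model.predict_proba(["good service", "bad service"])
classes = model.svm_predictor.classes_  # column order of proba

# Pair every probability with its class label instead of guessing the index.
for row in proba:
    print(dict(zip(classes.tolist(), row.round(3).tolist())))
```

If your pickled model was trained with SVC() using the defaults, I would expect adding the method alone not to be enough: you would need to refit with probability=True and pickle the model again. Note also that the sklearn docs warn these probabilities come from an internal calibration step, so on small datasets they may not always agree with what predict returns.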