How to use a saved SVM model for prediction

Time: 2018-04-12 04:12:15

Tags: python scikit-learn svm

I am following the post "How to use save model for prediction in python".

When I load the saved model and use it to predict on new data, I get the following error.

What can we do to fix it?

UnicodeEncodeError: 'decimal' codec can't encode character u'\u2019' in position 510: invalid decimal Unicode string

My entire code:

from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline
X_train, X_test, y_train, y_test = train_test_split(df['IssueDetails'], df['CRST'], random_state = 0)
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(X_train)
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
clf = LinearSVC().fit(X_train_tfidf, y_train)
cif_svm = Pipeline([('tfidf', tfidf_transformer), ('SVC', clf)])

from sklearn.externals import joblib
joblib.dump(cif_svm, 'modelsvm.pk1')

Fitmodel = joblib.load('modelsvm.pk1')
Fitmodel.predict(df_v)

1 Answer:

Answer 0 (score: 0)

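I found the answer to the problem above. I used the code below for prediction, transforming each text with the fitted count_vect before passing it to clf: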

datad['CRSTS']=datad['Detail'].apply(lambda x: unicode(clf.predict(count_vect.transform([x]))))
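
A likely reason the original call failed is that the saved Pipeline held only the TF-IDF transformer and the classifier, so Fitmodel.predict received raw text instead of the count matrix that CountVectorizer produces. An alternative is to put the CountVectorizer inside the pipeline, so the loaded model accepts raw strings directly. The following is only a minimal sketch under a few assumptions: Python 3, a recent scikit-learn, and the standalone joblib package (sklearn.externals.joblib was removed in later scikit-learn releases). The toy texts, labels, step names and temporary path are hypothetical stand-ins for df['IssueDetails'], df['CRST'] and the real file location.

```python
import os
import tempfile

import joblib  # assumption: standalone joblib package is installed
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data standing in for df['IssueDetails'] and df['CRST'].
# One text contains u'\u2019' (a curly apostrophe), as in the error message.
train_texts = [
    "printer won\u2019t print any pages",
    "printer is jammed with paper",
    "cannot log in to my account",
    "password reset link does not work",
]
train_labels = ["hardware", "hardware", "account", "account"]

# Keep the vectorizer inside the pipeline so raw strings can be passed
# to fit() and predict() without a separate transform step.
text_clf = Pipeline([
    ("counts", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("svc", LinearSVC()),
])
text_clf.fit(train_texts, train_labels)

# Save the whole fitted pipeline, then load it back.
model_path = os.path.join(tempfile.mkdtemp(), "modelsvm.pk1")
joblib.dump(text_clf, model_path)
loaded = joblib.load(model_path)

# Predict on new raw text; the loaded pipeline vectorizes it internally.
new_texts = ["the printer shows a paper jam", "I forgot my password"]
predictions = loaded.predict(new_texts)
print(list(predictions))
```

With this layout the per-row lambda is no longer needed: something like loaded.predict(datad['Detail']) should return one label per row, assuming the column contains plain text strings.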