Feedback in NaiveBayes text classification

Time: 2017-08-02 12:28:42

Tags: machine-learning nlp classification text-classification

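I'm new to machine learning and I'm building a complaint classification program. I want to add a feedback mechanism so that the model can improve over time.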

import numpy
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
value=[
'drought',
'robber',
]
targets=[
'water_department',
'police_department',
]
classifier = MultinomialNB()        
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(value)

classifier.partial_fit(counts[:1], targets[:1],classes=numpy.unique(targets))
for c,t in zip(counts[1:],targets[1:]):
    classifier.partial_fit(c, t.split())

value.append('dogs')                                   #new value to train
targets.append('animal_department')                    #new target
vectorize = CountVectorizer()
counts = vectorize.fit_transform(value)
print(counts)
print(targets)
print(vectorize.vocabulary_)
####problem lies here
classifier.partial_fit(counts["""dont know the index of new value"""], targets[-1])
####problem lies here

Even if I somehow find the index of the newly inserted value, it raises an error:

ValueError: Number of features 3 does not match previous data 2.

even though I am inserting only one value at a time.
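
The mismatch seems to come from fitting a second CountVectorizer on the extended list: its vocabulary grows from 2 terms to 3, so the new matrix has 3 columns while the classifier was trained on 2. A separate issue is that partial_fit expects the full set of classes on its first call, so a label introduced later ('animal_department') would be rejected as well. Below is a minimal sketch of one possible workaround, not a definitive answer. It assumes a recent scikit-learn (where HashingVectorizer accepts alternate_sign) and that every department is known in advance. ALL_CLASSES, learn and the choice of 2**10 features are hypothetical names and values chosen for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# Assumption: every department the system may ever route to is listed up front.
ALL_CLASSES = np.array(['animal_department', 'police_department', 'water_department'])

# HashingVectorizer is stateless, so the number of columns never changes
# when new words appear. alternate_sign=False keeps all values non-negative,
# which MultinomialNB requires; norm=None keeps raw term counts.
vectorizer = HashingVectorizer(n_features=2**10, alternate_sign=False, norm=None)
classifier = MultinomialNB()

def learn(texts, labels):
    """Hypothetical helper: feed one batch of feedback to the classifier."""
    X = vectorizer.transform(texts)  # always 2**10 columns
    classifier.partial_fit(X, labels, classes=ALL_CLASSES)

# Initial training data from the question.
learn(['drought', 'robber'], ['water_department', 'police_department'])

# Feedback arriving later: a new word and a label not seen in the first batch.
learn(['dogs'], ['animal_department'])

prediction = classifier.predict(vectorizer.transform(['dogs']))[0]
print(prediction)
```

The trade-off is that hashed features cannot be mapped back to words, and distinct words can occasionally collide in the same column. If the set of departments really is open-ended, incremental learning with partial_fit may not fit, and retraining on the accumulated data would be the simpler route.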

0 answers:

No answers yet.