NaiveBayes classifier

Time: 2018-01-24 22:58:40

Tags: python classification

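I have run into a problem with a Naive Bayes classifier. I am trying to analyze some sentences, but I get an error in Python. This is the Trainer class: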

from naiveBayesClassifier.trainedData import TrainedData

class Trainer(object):
    """docstring for Trainer"""
    def __init__(self, tokenizer):
        super(Trainer, self).__init__()
        self.tokenizer = tokenizer
        self.data = TrainedData()

    def train(self, text, className):
        """
        enhances trained data using the given text and class
        """
        self.data.increaseClass(className)

        tokens = self.tokenizer.tokenize(text)
        for token in tokens:
            token = self.tokenizer.remove_stop_words(token)
            token = self.tokenizer.remove_punctuation(token)
            self.data.increaseToken(token, className)

Does anyone know how to solve this? Thanks. The error in the console:

 tokens = self.tokenizer.tokenize(text)
AttributeError: module 'naiveBayesClassifier.tokenizer' has no attribute 'tokenize'

This is the main script:

from naiveBayesClassifier import tokenizer
from naiveBayesClassifier.trainer import Trainer
from naiveBayesClassifier.classifier import Classifier


postTrainer = Trainer(tokenizer)

postsSet = [
    {'text': 'not to eat too much is not enough to lose weight', 'category': 'health'},
    {'text': 'Russia try to invade Ukraine', 'category': 'politics'},
    {'text': 'do not neglect exercise', 'category': 'health'},
    {'text': 'Syria is the main issue, Obama says', 'category': 'politics'}
]

for post in postsSet:
    postTrainer.train(post['text'], post['category'])

postClassifier = Classifier(postTrainer.data, tokenizer)

classification = postClassifier.classify("Obama is")
print(classification)
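The error message says that `tokenizer` is a module, so the main script appears to pass the module itself to `Trainer`, and a module has no `tokenize` method unless it defines a function by that name. The sketch below reproduces this with stand-ins only: `fake_tokenizer_module`, the `Tokenizer` class inside it, and the simplified `Trainer` are all hypothetical and do not come from the naiveBayesClassifier package. Whether the real package exposes a class called `Tokenizer`, and what arguments its constructor takes, is an assumption to check against the installed source.

```python
import types

# Hypothetical stand-in for a module named `tokenizer` that defines a
# class `Tokenizer`. This is NOT the real naiveBayesClassifier module.
fake_tokenizer_module = types.ModuleType("tokenizer")


class Tokenizer:
    """Minimal illustrative tokenizer: lowercases and splits on whitespace."""

    def tokenize(self, text):
        return text.lower().split()


# Attach the class to the fake module, as a real module would expose it.
fake_tokenizer_module.Tokenizer = Tokenizer


class Trainer:
    """Simplified trainer that only calls tokenize() on what it was given."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def tokens(self, text):
        return self.tokenizer.tokenize(text)


def try_tokens(tokenizer_arg, text):
    """Return the tokens, or the AttributeError text if the call fails."""
    try:
        return Trainer(tokenizer_arg).tokens(text)
    except AttributeError as exc:
        return "AttributeError: " + str(exc)


# Passing the module: the module object has no `tokenize` attribute.
module_result = try_tokens(fake_tokenizer_module, "Obama is")

# Passing an instance of the class defined in the module: works.
instance_result = try_tokens(fake_tokenizer_module.Tokenizer(), "Obama is")

print(module_result)
print(instance_result)
```

If the installed package follows the same layout, the corresponding change in the main script would be to build a tokenizer object first and pass that object to both `Trainer` and `Classifier` instead of the imported module.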

0 answers:

No answers