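I'm really frustrated by this problem. I'm relatively new to Python and NLTK. I'm trying to build a Naive Bayes classifier, and I'm not sure whether the input should be a list of tuples, a dictionary, or a tuple of two lists.
The following input raises AttributeError: 'str' object has no attribute 'items':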
[('maggie: just a push button. and the electric car uses sensors to drive itself. \n', 'notending')]
The following format raises AttributeError: 'list' object has no attribute 'items':
[([['the', 'fire', 'chief', 'says', 'someone', 'started', 'the', 'blaze', 'on', 'purpose', 'as', 'a', 'controlled', 'burn', ',', 'but', 'it', 'quickly', 'got', 'out', 'of', 'hand', '.']], 'notending')]
If I use a dictionary, I get ValueError: too many values to unpack:
{'everyone: bye!': 'ending'}
I call the Naive Bayes classifier like this: classifier = nltk.NaiveBayesClassifier.train(d_train)
I'm not sure what is wrong here. Any help would be much appreciated. Thanks.
Answer 0 (score: 6)
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import stopwords

# English stopwords (requires the NLTK 'stopwords' corpus to be downloaded)
stopset = set(stopwords.words('english'))

def word_feats(words):
    # Map each non-stopword token to True; this dict is one feature set
    return dict([(word, True) for word in words.split() if word not in stopset])

posids = ['I love this sandwich.', 'I feel very good about these beers.']
negids = ['I hate this sandwich.', 'I feel worst about these beers.']

# Each training example is a (feature dict, label) tuple
pos_feats = [(word_feats(f), 'positive') for f in posids]
neg_feats = [(word_feats(f), 'negative') for f in negids]
print(pos_feats)
print(neg_feats)

trainfeats = pos_feats + neg_feats
classifier = NaiveBayesClassifier.train(trainfeats)
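Take a look at the positive and negative feature sets. Each one is a list of (feature dict, label) tuples, which is the input format that train() expects: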
[({'I': True, 'love': True, 'sandwich.': True}, 'positive'), ({'I': True, 'feel': True, 'good': True, 'beers.': True}, 'positive')]
[({'I': True, 'hate': True, 'sandwich.': True}, 'negative'), ({'I': True, 'feel': True, 'beers.': True, 'worst': True}, 'negative')]
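To connect this back to the three inputs in the question: train() unpacks each training item into a (featureset, label) pair and then calls .items() on the featureset. That is why a raw string or a list in the first slot fails with "has no attribute 'items'", and why a plain dict fails with "too many values to unpack" (iterating a dict yields its keys, so a whole string gets unpacked). Below is a minimal sketch of converting each of those shapes into (feature dict, label) tuples. The helper name bag_of_words and the shortened sample texts are my own assumptions for illustration, and stopword filtering is left out so the snippet runs without NLTK.

```python
def bag_of_words(tokens):
    """Turn an iterable of tokens into a {token: True} feature dict."""
    return {token: True for token in tokens}

# 1. List of (raw string, label) pairs: split each string into tokens.
#    (Sample text shortened from the question.)
string_pairs = [("maggie: just a push button.\n", "notending")]
from_strings = [(bag_of_words(text.split()), label)
                for text, label in string_pairs]

# 2. List of (nested token lists, label) pairs: flatten the nesting first.
nested_pairs = [([["the", "fire", "chief"]], "notending")]
from_nested = [(bag_of_words(tok for sent in sents for tok in sent), label)
               for sents, label in nested_pairs]

# 3. Dict mapping text to label: iterate over .items(), not the dict itself.
text_to_label = {"everyone: bye!": "ending"}
from_dict = [(bag_of_words(text.split()), label)
             for text, label in text_to_label.items()]

# All three now share the shape train() expects.
d_train = from_strings + from_nested + from_dict
print(d_train)
```

The resulting d_train can then be passed to nltk.NaiveBayesClassifier.train(d_train) exactly as in the question.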
So if you classify the sentence 'I hate everything':
print(classifier.classify(word_feats('I hate everything')))
you will get the result 'negative'.
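If you want to see how confident the classifier is rather than just the winning label, prob_classify() returns a probability distribution over the labels. Here is a rough, self-contained sketch using the same four sentences. Note that I have dropped the stopword filter here (an assumption on my part, so the snippet does not need the stopwords corpus), which means the feature dicts differ slightly from the ones printed earlier.

```python
from nltk.classify import NaiveBayesClassifier

def word_feats(text):
    # Simplified variant without stopword filtering (assumption for this sketch)
    return {word: True for word in text.split()}

train_data = [
    (word_feats('I love this sandwich.'), 'positive'),
    (word_feats('I feel very good about these beers.'), 'positive'),
    (word_feats('I hate this sandwich.'), 'negative'),
    (word_feats('I feel worst about these beers.'), 'negative'),
]
classifier = NaiveBayesClassifier.train(train_data)

# prob_classify returns a distribution; max() is the most likely label
dist = classifier.prob_classify(word_feats('I hate everything'))
best_label = dist.max()
for label in dist.samples():
    print(label, dist.prob(label))
```

Words the classifier never saw in training (such as 'everything') are ignored at classification time, so the decision here rests on 'hate', which only occurs in a negative example.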