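I am working on sentiment analysis and want to test the accuracy of several classifiers. If I do not convert trainset to a dict, I get the error "AttributeError: 'tuple' object has no attribute 'iterkeys'".
However, after converting it to a dict, I get this error: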
Traceback (most recent call last):
  File "E:\Python27\accuracy.py", line 204, in <module>
    print 'BernoulliNB`s accuracy is %f' %score(BernoulliNB())
  File "E:\Python27\accuracy.py", line 200, in score
    classifier.train(trainset)
  File "E:\Python27\lib\site-packages\nltk\classify\scikitlearn.py", line 93, in train
    for fs, label in labeled_featuresets:
ValueError: too many values to unpack
Part of the code:
trainset = extracted_pos_features[50:]+extracted_neg_features[50:]
testset = extracted_pos_features[:50]+extracted_neg_features[:50]
dict1 = {}
for i,j in trainset:
    dict1.setdefault(j,[]).append(i)
trainset = dict1
test, tag_test = zip(*testset)
def score(classifier):
    classifier = SklearnClassifier(classifier)
    classifier.train(trainset)
    pred = classifier.batch_classify(test)
    return accuracy_score(tag_test, pred)
print 'BernoulliNB`s accuracy is %f' %score(BernoulliNB())
dict1 has two keys, 'neg' and 'pos', each with multiple values:
dict1
{'neg': [('tone', 'ultimately'), ('tragedy', 'core'), ('ultimately', 'dulls'), ('update', 'dreary'), ('version', 'looks'), ('voice', 'lack'), ('worst', 'film'), ('yarn', 'eloquent'), ('makes', 'little'), ('makes', 'maryam'), ('remain', 'true'), ('screen', 'time'), ('sluggish', 'time'), ('thesis', 'makes'), ('time', 'machine'), ('true', 'chan'), ('true', 'original'), ('unashamedly', 'makes'), ('time', 'true')],
'pos': [('rock', 'destined'), ('schwarzenegger', 'van'), ('screenplay', 'curls'), ('segal', 'gorgeously'), ('slice', 'asian'), ('snappy', 'screenplay'), ('somehow', 'pulls'), ('sometimes', 'movies'), ('splash', 'arnold'), ('start', 'emerges'), ('steers', 'snappy'), ('steven', 'segal'), ('top', 'game'), ('trilogy', 'huge'), ('van', 'damme'), ('vision', 'effective'), ('wasabi', 'start'), ('words', 'adequately'), ('cat', 'offers'), ('emerges', 'rare'), ('game', 'offers'), ('offers', 'refreshingly'), ('rare', 'combination'), ('rare', 'issue'), ('offers', 'rare')]}
Does anyone know how to fix it? Thank you very much.
Answer 0 (score: 0)
This is a typical mistake I make when I forget to call items() while looping over a dict:
dct = {"aaa": 11, "bbb": 22, "ccc": 33}
for key, val in dct.items():
    print "key", key
    print "val", val
Without items(), iterating over the dict yields only the keys, and each key is then unpacked as if it were a sequence.
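The difference can be seen in a minimal sketch that reuses the dct example above. The names keys_only and pairs are only illustrative, and the results are sorted so the order does not depend on the Python version:

```python
dct = {"aaa": 11, "bbb": 22, "ccc": 33}

# Iterating over the dict itself yields only the keys.
keys_only = sorted(key for key in dct)

# Iterating over items() yields (key, value) pairs, which unpack into two names.
pairs = sorted((key, val) for key, val in dct.items())

print(keys_only)
print(pairs)
```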
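In your case, each key is a string, so it is treated as a sequence of characters. Because your keys ('neg' and 'pos') are three characters long rather than exactly two, there are too many items (characters) to unpack into the two variables fs and label.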
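Here is a rough sketch of both the failure and one possible way around it. The loop in the traceback unpacks each element into two names, so it needs a flat list of (featureset, label) pairs rather than a dict keyed by label. The helper names try_unpack and to_labeled_featuresets are hypothetical. Turning each bigram into a one-entry feature dict with a joined string key is an assumption on my part, prompted by the first error in the question ('tuple' object has no attribute 'iterkeys'), which suggests the featureset itself needs to be a dict.

```python
def try_unpack(item):
    """Unpack item into exactly two names, as `for fs, label in ...` does."""
    try:
        fs, label = item
    except ValueError as exc:
        return "ValueError: %s" % exc
    return (fs, label)


def to_labeled_featuresets(grouped):
    """Flatten {label: [bigram, ...]} into [({feature: True}, label), ...].

    Assumption: each bigram becomes its own one-entry feature dict,
    keyed by the two words joined with a space.
    """
    pairs = []
    for label in sorted(grouped):
        for bigram in grouped[label]:
            pairs.append(({" ".join(bigram): True}, label))
    return pairs


# A three-character key cannot be unpacked into two names.
three_chars = try_unpack("neg")
# A two-character string happens to unpack, which is why the length matters.
two_chars = try_unpack("ab")

# A small sample in the same shape as dict1 from the question.
dict1_sample = {
    "neg": [("worst", "film")],
    "pos": [("rare", "combination"), ("top", "game")],
}
labeled = to_labeled_featuresets(dict1_sample)

print(three_chars)
print(two_chars)
for fs, label in labeled:
    print(fs, label)
```

If this shape matches what you intended, a list like labeled (rather than dict1) is what you would pass to classifier.train. Whether one bigram per training example is the right encoding for your data is something only you can decide.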