VaderSentiment error: TypeError: a bytes-like object is required, not 'str'

Asked: 2018-07-24 21:24:54

Tags: python api sentiment-analysis vader

Hi, I wrote the code below to perform sentiment analysis:

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

import time
analyzer = SentimentIntensityAnalyzer()

pos_count = 0
pos_correct = 0

with open('EVG_text mining.txt', mode='rb') as f: 
    bytes = f.read()
    text = bytes.decode('utf-8', 'ignore') 
    for line in f.read().split('\n'):
        vs = analyzer.polarity_scores(line)
        if not vs['neg'] > 0.1:
            if vs['pos']-vs['neg'] > 0:
                pos_correct += 1
            pos_count +=1


neg_count = 0
neg_correct = 0

with open('EVG_text mining.txt', mode='rb') as f: 
    for line in f.read().split('\n'):
        vs = analyzer.polarity_scores(line)
        if not vs['pos'] > 0.1:
            if vs['pos']-vs['neg'] <= 0:
                neg_correct += 1
            neg_count +=1

print("Positive accuracy = {}% via {} samples".format(pos_correct/pos_count*100.0, pos_count))
print("Negative accuracy = {}% via {} samples".format(neg_correct/neg_count*100.0, neg_count))

However, I get an error:

Traceback (most recent call last):
  File "<ipython-input-9-62462b5174b4>", line 12, in <module>
    for line in f.read().split('\n'):
TypeError: a bytes-like object is required, not 'str'

How can I fix this?

1 answer:

Answer 0 (score: 0)

You have the file opened in binary mode. Reading from it returns bytes, not str.

This line:

bytes = f.read()

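reads the entire file into a variable named bytes (don't do that: Python already has a built-in type named bytes, and by using this name you are shadowing the built-in).

You then go on to decode the bytes: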
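
To illustrate why shadowing matters, here is a minimal sketch. The `data` value and the `shadowed` function are hypothetical stand-ins, not part of the question's code. Once a local variable is called `bytes`, the built-in type can no longer be reached by that name in the same scope:

```python
# Hypothetical stand-in for the contents returned by f.read() in binary mode.
data = b"good movie\nbad movie\n"

def shadowed():
    bytes = data  # the local name now hides the built-in bytes type
    try:
        # Intended as a call to the built-in, but it calls the bytes object.
        return bytes("abc", "utf-8")
    except TypeError as exc:
        return exc

result = shadowed()
print(type(result).__name__, result)
```

A neutral name such as `raw` or `data` avoids the problem.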

text = bytes.decode('utf-8', 'ignore') 

But then you end up reading the file again!

for line in f.read().split('\n'):

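Because the file has already been read, this second f.read() returns an empty byte string (b''). Calling .split('\n') on it passes a str separator to a bytes object, which raises the error you are seeing.

Rather than reading the file up front, I suggest opening it in text mode. That way you don't have to decode or split anything, because the data is delivered line by line, already decoded: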
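
The behaviour can be reproduced without the original file. The sketch below uses `io.BytesIO` as an in-memory stand-in for a file opened with mode='rb'; the sample contents are made up for illustration:

```python
import io

# In-memory stand-in for a file opened in binary mode (hypothetical contents).
f = io.BytesIO(b"I love this\nI hate this\n")

first = f.read()   # the whole content, as bytes
second = f.read()  # the position is already at the end, so this is b''

try:
    second.split('\n')  # str separator on a bytes object
    error = None
except TypeError as exc:
    error = str(exc)

print(repr(second))
print(error)

# Splitting the decoded text works, because both sides are str.
lines = first.decode('utf-8', 'ignore').split('\n')
print(lines)
```

Note that the TypeError comes from mixing bytes and str; `second.split(b'\n')` would not raise, but it would only yield `[b'']` because nothing is left to read.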

with open('EVG_text mining.txt', encoding='utf-8') as f: 
    for line in f: # lines come already decoded
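
A fuller sketch of this approach, under some assumptions: `count_accuracy` is a hypothetical helper (not part of vaderSentiment), it takes the scoring function as a parameter so it can be tried without the library installed, it folds the question's two passes over the file into one, and it skips blank lines, which the original code did not do.

```python
def count_accuracy(path, score):
    """Count positive/negative hits using the thresholds from the question.

    `score` is any callable returning a dict with 'pos' and 'neg' keys,
    for example SentimentIntensityAnalyzer().polarity_scores.
    """
    pos_count = pos_correct = neg_count = neg_correct = 0
    # Text mode: lines arrive already decoded, one at a time.
    with open(path, encoding='utf-8', errors='ignore') as f:
        for line in f:
            line = line.strip()
            if not line:  # assumption: blank lines are not samples
                continue
            vs = score(line)
            if not vs['neg'] > 0.1:
                if vs['pos'] - vs['neg'] > 0:
                    pos_correct += 1
                pos_count += 1
            if not vs['pos'] > 0.1:
                if vs['pos'] - vs['neg'] <= 0:
                    neg_correct += 1
                neg_count += 1
    return pos_correct, pos_count, neg_correct, neg_count

# Usage with VADER (requires the vaderSentiment package):
# from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
# analyzer = SentimentIntensityAnalyzer()
# print(count_accuracy('EVG_text mining.txt', analyzer.polarity_scores))
```

Returning raw counts instead of percentages leaves the division to the caller, who can then guard against a count of zero.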