Hi, I wrote the code below to do sentiment analysis:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import time
analyzer = SentimentIntensityAnalyzer()
pos_count = 0
pos_correct = 0
with open('EVG_text mining.txt', mode='rb') as f:
    bytes = f.read()
    text = bytes.decode('utf-8', 'ignore')
    for line in f.read().split('\n'):
        vs = analyzer.polarity_scores(line)
        if not vs['neg'] > 0.1:
            if vs['pos']-vs['neg'] > 0:
                pos_correct += 1
            pos_count += 1
neg_count = 0
neg_correct = 0
with open('EVG_text mining.txt', mode='rb') as f:
    for line in f.read().split('\n'):
        vs = analyzer.polarity_scores(line)
        if not vs['pos'] > 0.1:
            if vs['pos']-vs['neg'] <= 0:
                neg_correct += 1
            neg_count += 1
print("Positive accuracy = {}% via {} samples".format(pos_correct/pos_count*100.0, pos_count))
print("Negative accuracy = {}% via {} samples".format(neg_correct/neg_count*100.0, neg_count))
However, I get an error:
Traceback (most recent call last):
  File "<ipython-input-9-62462b5174b4>", line 12, in <module>
    for line in f.read().split('\n'):
TypeError: a bytes-like object is required, not 'str'
How can I fix this?
Answer 0 (score: 0)
You have opened the file in binary mode, so reading from it returns bytes, not str.
This line:
bytes = f.read()
reads the whole file into a variable named bytes (don't do that: Python already has a built-in type named bytes, and by reusing the name you are "shadowing" the built-in).
You then go on to decode the bytes:
text = bytes.decode('utf-8', 'ignore')
But then you end up reading the file again!
for line in f.read().split('\n'):
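Because the file has already been read to the end, this second read returns an empty bytes object (b''). Calling .split('\n') on it passes a str separator to a bytes method, which raises the error you see.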
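Both effects can be reproduced in isolation. The sketch below is only an illustration: it uses an in-memory io.BytesIO object with made-up sample text as a stand-in for your file opened with mode='rb'.

```python
import io

# Assumption: an in-memory stand-in for a file opened in binary mode,
# with made-up sample content instead of the real text file.
f = io.BytesIO("good day\nbad day\n".encode("utf-8"))

first = f.read()   # consumes the whole stream
second = f.read()  # nothing is left, so this is an empty bytes object

print(first)   # b'good day\nbad day\n'
print(second)  # b''

# Splitting bytes with a str separator raises the TypeError from the traceback.
try:
    second.split('\n')
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Either split with a bytes separator, or decode first and split the str.
lines_from_bytes = first.split(b'\n')
lines_from_text = first.decode('utf-8', 'ignore').split('\n')
print(lines_from_bytes)
print(lines_from_text)
```

Note that the error would appear even if the second read returned data, since the mismatch is between the bytes object and the str separator.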
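I suggest that, instead of reading the file up front, you open it in text mode. Then there is nothing to decode or split, because the file yields its data line by line, already decoded: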
with open('EVG_text mining.txt', encoding='utf-8') as f:
    for line in f:  # lines arrive already decoded
        vs = analyzer.polarity_scores(line)
        ...  # rest of the loop body as before
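Since both of your loops scan the same file, one possible way to put this together is a single pass that updates all four counters. This is a rough sketch, not a drop-in replacement: score_lines is a hypothetical helper name, and it assumes only that the analyzer object has a polarity_scores method returning 'pos' and 'neg' values, as in your code.

```python
def score_lines(lines, analyzer):
    """Count lines using the same thresholds as the two loops in the question.

    `lines` is any iterable of str, for example a file opened in text mode.
    `analyzer` is any object with a polarity_scores(text) method (assumption).
    """
    pos_count = pos_correct = neg_count = neg_correct = 0
    for line in lines:
        line = line.rstrip('\n')  # lines read from a file keep their newline
        vs = analyzer.polarity_scores(line)
        if not vs['neg'] > 0.1:
            if vs['pos'] - vs['neg'] > 0:
                pos_correct += 1
            pos_count += 1
        if not vs['pos'] > 0.1:
            if vs['pos'] - vs['neg'] <= 0:
                neg_correct += 1
            neg_count += 1
    return pos_correct, pos_count, neg_correct, neg_count


# Usage with the file and analyzer from the question (not run here):
# from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
# with open('EVG_text mining.txt', encoding='utf-8') as f:
#     print(score_lines(f, SentimentIntensityAnalyzer()))
```

Passing the open file object works because a text-mode file is itself an iterable of decoded lines. If either count ends up as zero, the percentage calculation in your print statements would divide by zero, so that case may need a guard.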