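I'm new to Python, and I have a dataset of reviews.
I'm extracting the comments from the dataset and trying to apply the VADER tool to check the sentiment weights associated with each comment. I can retrieve the comments successfully, but I can't get VADER to score each comment. Here is the code: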
import nltk
import requirements_elicitation
from nltk.sentiment.vader import SentimentIntensityAnalyzer

c = requirements_elicitation.read_reviews("D:\\Python\\testml\\my-tracks-reviews.csv")

class SentiFind:
    def __init__(self, review):
        self.review = review

for review in c:
    review = review.comment
    print(review)
    sid = SentimentIntensityAnalyzer()
    for i in review:
        print(i)
        ss = sid.polarity_scores(i)
        for k in sorted(ss):
            print('{0}: {1}, '.format(k, ss[k]), end='')
        print()
Sample output:
g
compound: 0.0, neg: 0.0, neu: 0.0, pos: 0.0,
r
compound: 0.0, neg: 0.0, neu: 0.0, pos: 0.0,
e
compound: 0.0, neg: 0.0, neu: 0.0, pos: 0.0,
a
compound: 0.0, neg: 0.0, neu: 0.0, pos: 0.0,
t
compound: 0.0, neg: 0.0, neu: 0.0, pos: 0.0,
compound: 0.0, neg: 0.0, neu: 0.0, pos: 0.0,
a
compound: 0.0, neg: 0.0, neu: 0.0, pos: 0.0,
p
compound: 0.0, neg: 0.0, neu: 0.0, pos: 0.0,
p
I also need a custom label for each comment, in this form:
"Total weight: {0}, Negative: {1}, Neutral: {2}, Positive: {3}".
Answer 0 (score: 2)
The review you defined is a string, so when you iterate over it you get each letter:
for i in review:
    print(i)
g
r
e
a...
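To make the difference concrete, here is a small standalone sketch. The `comments` list is hypothetical sample data standing in for the `review.comment` values; it is not taken from the asker's CSV file.

```python
# Hypothetical sample data standing in for the review.comment values.
comments = ["great app", "keeps crashing"]

# Iterating over a single string yields one character at a time,
# which is why VADER was being asked to score 'g', 'r', 'e', ...
chars = [ch for ch in comments[0]]
print(chars)

# Iterating over the list yields whole comments, which is the unit
# a sentiment analyzer should receive.
whole = [comment for comment in comments]
print(whole)
```

Single letters are not meaningful input for a lexicon-based scorer, which matches the all-zero scores in the question's output.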
So you need to pass each whole comment to the analyzer instead of looping over its characters:
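# Create the analyzer once, outside the loop
sid = SentimentIntensityAnalyzer()
for review in c:
    review = review.comment
    # polarity_scores returns a dict, so read the values by key
    ss = sid.polarity_scores(review)
    total_weight = ss['compound']
    positive = ss['pos']
    negative = ss['neg']
    neutral = ss['neu']
    # Argument order must match the label order: negative, neutral, positive
    print("Total weight: {0}, Negative: {1}, Neutral: {2}, Positive: {3}".format(total_weight, negative, neutral, positive))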
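If the label is needed in more than one place, the formatting step can be pulled into a small helper. This is only a sketch: `format_scores` and `LABEL` are names introduced here for illustration, and the `sample` dict contains made-up numbers shaped like the dict that `polarity_scores` returns (keys `compound`, `neg`, `neu`, `pos`), so it runs without NLTK installed.

```python
# Label text taken from the question.
LABEL = "Total weight: {0}, Negative: {1}, Neutral: {2}, Positive: {3}"


def format_scores(scores):
    """Render a VADER-style score dict using the requested label.

    Hypothetical helper: expects the keys 'compound', 'neg', 'neu', 'pos'.
    """
    return LABEL.format(
        scores["compound"], scores["neg"], scores["neu"], scores["pos"]
    )


# Made-up scores for illustration only, not real VADER output.
sample = {"compound": 0.6249, "neg": 0.0, "neu": 0.196, "pos": 0.804}
line = format_scores(sample)
print(line)
```

Inside the loop above, the call would presumably look like `print(format_scores(sid.polarity_scores(review)))`, assuming `review` already holds the comment string.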