Unable to update the VADER lexicon

Time: 2019-02-22 16:13:52

Tags: nlp nltk sentiment-analysis natural-language-processing vader

print(news['title'][5]) Magnitude 7.5 quake hits Peru-Ecuador border region - The Hindu

print(analyser.polarity_scores(news['title'][5])) {'neg':0.0,'neu':1.0,'pos':0.0,'compound':0.0}

from nltk.tokenize import word_tokenize, RegexpTokenizer
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# news is a pandas DataFrame with a 'title' column, loaded elsewhere
sentence = news['title'][5]

# word_tokenize is imported by name above, so call it without the nltk. prefix
tokenized_sentence = word_tokenize(sentence)
pos_word_list = []
neu_word_list = []
neg_word_list = []

# Score each token on its own and bucket it by compound score
for word in tokenized_sentence:
    compound = analyzer.polarity_scores(word)['compound']
    if compound >= 0.1:
        pos_word_list.append(word)
    elif compound <= -0.1:
        neg_word_list.append(word)
    else:
        neu_word_list.append(word)

print('Positive:', pos_word_list)
print('Neutral:', neu_word_list)
print('Negative:', neg_word_list)
score = analyzer.polarity_scores(sentence)
print('\nScores:', score)

Positive: []
Neutral: ['Magnitude', '7.5', 'quake', 'hits', 'Peru-Ecuador', 'border', 'region', '-', 'The', 'Hindu']
Negative: []

Scores: {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}

new_words = {
    'Peru-Ecuador': -2.0,
    'quake': -3.4,
}

analyser.lexicon.update(new_words)
print(analyzer.polarity_scores(sentence))

{'neg':0.0,'neu':1.0,'pos':0.0,'compound':0.0}

1 answer:

Answer 0 (score: 0)

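The code you are using is correct apart from one typo: when updating the lexicon, you used analyser instead of analyzer. (I am not sure why that raised no error; presumably an analyser object already existed in your session, since your first print call uses it, so the update went to that object and never reached analyzer.)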

new_words = {
    'Peru-Ecuador': -2.0,
    'quake': -3.4,
}

analyzer.lexicon.update(new_words)
print(analyzer.polarity_scores(sentence))

Output:

{'neg': 0.355, 'neu': 0.645, 'pos': 0.0, 'compound': -0.6597}
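As a sanity check on that output, here is a small sketch of how the compound value seems to arise. It assumes, from my reading of the vaderSentiment source, that the compound score is the summed word valence x normalized as x / sqrt(x^2 + 15); the `normalize` function below is my own re-implementation for illustration, not an import from the library. Under that assumption, -0.6597 is what 'quake' alone would give, which suggests the 'Peru-Ecuador' entry did not contribute. A likely reason is that VADER lowercases each token before looking it up, so a lexicon key containing capital letters is never matched; treat that as an assumption to verify against your installed version.

```python
import math

def normalize(score, alpha=15):
    """Re-implementation (assumed) of VADER's compound normalization."""
    return score / math.sqrt(score * score + alpha)

# Only 'quake' (-3.4) contributes to the sentence score
quake_only = normalize(-3.4)

# Hypothetical case: both 'quake' (-3.4) and 'peru-ecuador' (-2.0) contribute at full weight
both_words = normalize(-3.4 + -2.0)

print(round(quake_only, 4))  # -0.6597, the compound value shown above
print(both_words < quake_only)  # True: two matched words would score lower
```

If you want the place name to count as well, it may be worth trying the key in lowercase ('peru-ecuador') and checking whether the compound score changes.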

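One more caveat (not sure whether you made this mistake): do not re-run the import and setup code after updating the lexicon. Creating a new SentimentIntensityAnalyzer() loads the default lexicon again, so the words you added will be gone. The steps should be:

  1. Import the library and create the analyzer (this loads the lexicon)
  2. Update the lexicon (after this step, do not re-import the library or re-create the analyzer)
  3. Compute the sentiment scores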
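The three steps above, and what goes wrong when the setup is repeated, can be sketched without installing anything. `StandInAnalyzer` below is a hypothetical stand-in written for this illustration, not part of vaderSentiment, and its default entries are made-up values. It only mimics the one relevant behaviour: like SentimentIntensityAnalyzer, each new instance builds its own lexicon dict from the defaults.

```python
class StandInAnalyzer:
    """Hypothetical stand-in for SentimentIntensityAnalyzer (illustration only)."""

    # Made-up default entries; the real lexicon is loaded from a file
    DEFAULT_LEXICON = {"good": 1.0, "bad": -1.0}

    def __init__(self):
        # Every instance gets a fresh copy of the defaults
        self.lexicon = dict(self.DEFAULT_LEXICON)

new_words = {"quake": -3.4}

# Step 1: create the analyzer once
analyzer = StandInAnalyzer()

# Step 2: update the lexicon on that same object
analyzer.lexicon.update(new_words)

# Step 3: use that same object (here we only check the lookup)
has_quake_before = "quake" in analyzer.lexicon

# Re-running the setup creates a new object with only the default lexicon
analyzer = StandInAnalyzer()
has_quake_after = "quake" in analyzer.lexicon

print(has_quake_before, has_quake_after)  # True False
```

The same reasoning covers the analyser/analyzer typo: two variables bound to two separate instances hold two separate lexicons, so updating one leaves the other unchanged.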