Finding out whether the opinion in a sentence is positive or negative

Time: 2016-03-01 11:52:08

Tags: python python-2.7 pos-tagger senti-wordnet

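I need to find the opinion expressed in some reviews posted on a website. I am using SentiWordNet. I first send the file containing all the reviews to a POS tagger.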

tokens = nltk.word_tokenize(line)  # tokenize each line read from the file
tagged = nltk.pos_tag(tokens)      # POS tagging
print(tagged)

Is there a more accurate way of tagging that treats it as one word, rather than as two separate words?

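Now I have to assign positive and negative scores to the tagged words and then compute a total score. Is there any function for this in SentiWordNet? Please help.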

1 answer:

Answer 0: (score: 2)

First extract the adverbs and adjectives from the review, for example:

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
import csv

para = "What can I say about this place. The staff of the restaurant is nice and the eggplant is not bad. Apart from that, very uninspired food, lack of atmosphere and too expensive. I am a staunch vegetarian and was sorely dissapointed with the veggie options on the menu. Will be the last time I visit, I recommend others to avoid"

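# Tokenize the review, then keep only adjectives (JJ*) and adverbs (RB*).
tokens = word_tokenize(para)
word_features = []

for word, tag in nltk.pos_tag(tokens):
    if tag in ['JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS']:
        word_features.append(word)

# Read words.txt once; each row looks like "word,pos" or "word,neg".
with open('words.txt', 'rt') as f:
    rows = list(csv.reader(f, delimiter=','))

rating = 0

# Add 1 for each positive match and subtract 1 for each negative match.
for word in word_features:
    for row in rows:
        if len(row) >= 2 and word == row[0]:
            print("%s %s" % (word, row[1]))
            if row[1] == 'pos':
                rating = rating + 1
            elif row[1] == 'neg':
                rating = rating - 1
print(rating)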

Now you need an external CSV file that lists the positive and negative words.

For example:

wrinkle,neg
wrinkled,neg
wrinkles,neg
masterfully,pos
masterpiece,pos
masterpieces,pos
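A minimal sketch of loading such a file into a dictionary, assuming one word and one label per row. The helper name `load_lexicon` is made up for illustration, and lowercasing the entries on load is an added assumption so that the comparison with 'pos' and 'neg' in the script does not depend on how the file was typed.

```python
import csv
import io


def load_lexicon(csv_file):
    """Read rows of the form word,label into a dict mapping word -> label."""
    lexicon = {}
    for row in csv.reader(csv_file, delimiter=','):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        word = row[0].strip().lower()
        label = row[1].strip().lower()
        if label in ('pos', 'neg'):
            lexicon[word] = label
    return lexicon


# In-memory stand-in for words.txt, so the sketch runs without a file on disk.
sample = io.StringIO(u"nice,pos\nbad,NEG\n\nexpensive, neg\n")
lexicon = load_lexicon(sample)
print(sorted(lexicon.items()))
```

With a real file, the same helper would be called on the handle returned by open('words.txt', 'rt'). A dictionary also makes each lookup a single step instead of a scan over every row.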

The script above works as follows:

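1. Read the sentence
2. Extract the adverbs and adjectives
3. Compare them with the positive and negative words in the CSV file
4. Then rate the sentence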

The result of the script above is:

nice pos  
bad neg  
expensive neg  
sorely neg  
-2
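
Note that the output above counts "bad" as negative even though the review says "not bad". One rough way to handle this, sketched here under assumptions, is to flip the polarity of a word when the token right before it is a negator. The function `rate_with_negation`, the `NEGATORS` set, and the hand-tagged tokens are all illustrative and not part of the original script; the input has the same (word, tag) shape that `nltk.pos_tag` returns.

```python
NEGATORS = {'not', "n't", 'never', 'no'}  # assumed list, extend as needed
SENTIMENT_TAGS = {'JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS'}


def rate_with_negation(tagged, lexicon):
    """Sum +1/-1 per lexicon word, flipping the sign right after a negator."""
    rating = 0
    previous = None
    for word, tag in tagged:
        lowered = word.lower()
        label = lexicon.get(lowered)
        if tag in SENTIMENT_TAGS and label in ('pos', 'neg'):
            score = 1 if label == 'pos' else -1
            if previous in NEGATORS:
                score = -score  # e.g. "not bad" counts as positive
            rating += score
        previous = lowered
    return rating


lexicon = {'nice': 'pos', 'bad': 'neg', 'expensive': 'neg'}
# Hand-tagged tokens in the (word, tag) format returned by nltk.pos_tag.
tagged = [('The', 'DT'), ('staff', 'NN'), ('is', 'VBZ'), ('nice', 'JJ'),
          ('and', 'CC'), ('the', 'DT'), ('eggplant', 'NN'), ('is', 'VBZ'),
          ('not', 'RB'), ('bad', 'JJ'), ('but', 'CC'), ('too', 'RB'),
          ('expensive', 'JJ')]
print(rate_with_negation(tagged, lexicon))
```

This only looks one token back, so it will miss cases such as "not very good"; a wider window or a proper parser would be needed for those.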

Change the result according to your needs. Sorry for my English :P
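
Since the question asks about SentiWordNet specifically: NLTK ships a reader for it as `nltk.corpus.sentiwordnet`, whose `senti_synsets(word, pos)` returns entries with `pos_score()` and `neg_score()` methods. The following is only a sketch under assumptions, not a definitive implementation. The helper names (`penn_to_wordnet`, `swn_lookup`, `total_scores`) are made up for illustration, averaging over all senses of a word is a simplification that ignores word-sense disambiguation, and it assumes the `sentiwordnet` and `wordnet` corpora have already been fetched with `nltk.download`.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS letter, or None."""
    if tag.startswith('JJ'):
        return 'a'  # adjective
    if tag.startswith('RB'):
        return 'r'  # adverb
    if tag.startswith('NN'):
        return 'n'  # noun
    if tag.startswith('VB'):
        return 'v'  # verb
    return None


def swn_lookup(word, wn_pos):
    """Average (positive, negative) score over all SentiWordNet senses."""
    # Imported here so the other helpers can be used without NLTK installed.
    from nltk.corpus import sentiwordnet as swn
    senses = list(swn.senti_synsets(word, wn_pos))
    if not senses:
        return 0.0, 0.0
    pos = sum(s.pos_score() for s in senses) / len(senses)
    neg = sum(s.neg_score() for s in senses) / len(senses)
    return pos, neg


def total_scores(tagged, lookup=swn_lookup):
    """Sum positive and negative scores over (word, tag) pairs."""
    pos_total, neg_total = 0.0, 0.0
    for word, tag in tagged:
        wn_pos = penn_to_wordnet(tag)
        if wn_pos is None:
            continue  # determiners, punctuation, etc. carry no score here
        pos, neg = lookup(word.lower(), wn_pos)
        pos_total += pos
        neg_total += neg
    return pos_total, neg_total
```

It could be called on the tagged list from the question, for example `pos, neg = total_scores(tagged)`, and the review treated as positive when pos is larger than neg. The `lookup` parameter is there so a different scoring source can be swapped in.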