Question

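I'm currently working on sentiment analysis, and I'd like to use the Stanford CoreNLP library from Python. I can get the sentiment of each sentence with the following code:

from pycorenlp import StanfordCoreNLP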

nlp = StanfordCoreNLP('http://localhost:9000')
res = nlp.annotate("I love you. I hate him. You are nice. He is dumb",
                   properties={
                       'annotators': 'sentiment',
                       'outputFormat': 'json',
                       'timeout': 1000,
                   })
for s in res["sentences"]:
    print("%d: '%s': %s %s" % (
        s["index"],
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))

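However, my actual requirement is different: I have a text file containing about 100 sentences, separated by newlines.

So I tried the following code to open the text file, read the sentences, and find the sentiment of each one.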

from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('http://localhost:9000')

with open("/Users/abc/Desktop/test_data.txt","r") as f:
    for line in f.read().split('\n'):
        print("Line:" + line)
        res = nlp.annotate(line,
                   properties={
                       'annotators': 'sentiment',
                       'outputFormat': 'json',
                       'timeout': 1000,
                   })
for s in res["sentences"]:
    print("%d: '%s': %s %s" % (
        s["index"],
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))

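However, the results for the sentences in the text file somehow get overwritten, and I only get the sentiment of the last sentence. Since I'm new to Python, could anyone help me with this?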

Answer 1

I'll take a stab at it, but as I commented, I'm not really qualified, and this code is untested. Added or changed lines are marked with # <<<<<<.

from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('http://localhost:9000')

results = []     # <<<<<<

with open("/Users/abc/Desktop/test_data.txt","r") as f:
    for line in f.read().split('\n'):
        print("Line:" + line)
        res = nlp.annotate(line,
                   properties={
                       'annotators': 'sentiment',
                       'outputFormat': 'json',
                       'timeout': 1000,
                   })
        results.append(res)      # <<<<<<

for res in results:              # <<<<<<
    # res["sentences"] is a list, so loop over it instead of indexing it directly
    for s in res["sentences"]:   # <<<<<<
        print("%d: '%s': %s %s" % (
            s["index"],
            " ".join([t["word"] for t in s["tokens"]]),
            s["sentimentValue"], s["sentiment"]))
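
For what it's worth, the question's version prints only the last sentence because its printing loop runs after the file loop has finished, by which point res holds only the final annotation. Here is a minimal sketch of handling each result inside the loop instead. So that it runs without a CoreNLP server, it uses fake_annotate, a hypothetical stand-in for nlp.annotate that returns a dict shaped like the JSON the question's code reads. The sentiment values it returns are placeholders, not real model output.

```python
def fake_annotate(text):
    """Hypothetical stand-in for nlp.annotate(); mimics the keys used in the question."""
    return {
        "sentences": [{
            "index": 0,
            "tokens": [{"word": w} for w in text.split()],
            "sentimentValue": "2",      # placeholder value, not a real prediction
            "sentiment": "Neutral",     # placeholder label, not a real prediction
        }]
    }


def format_sentences(res):
    """Turn one annotation result into printable rows, one per sentence."""
    return [
        "%d: '%s': %s %s" % (
            s["index"],
            " ".join(t["word"] for t in s["tokens"]),
            s["sentimentValue"], s["sentiment"])
        for s in res["sentences"]
    ]


lines = ["I love you.", "I hate him."]
report = []
for line in lines:
    res = fake_annotate(line)
    # Formatting happens inside the loop, so each res is used before it is replaced.
    report.extend(format_sentences(res))

for row in report:
    print(row)
```

With a real server, fake_annotate(line) would be swapped for the nlp.annotate(line, properties={...}) call from the question.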

I suppose for line in f.read().split('\n'): could be replaced with the simpler for line in f:, but I can't be sure without seeing your input file.
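
A rough sketch of the simpler iteration, under the assumption that the file holds one sentence per line and is UTF-8 encoded. The helper name read_sentences is hypothetical. Note that for line in f: keeps the trailing newline on each line, so the sketch strips it and also skips blank lines, which would otherwise be sent to the annotator as empty strings.

```python
def read_sentences(path):
    """Return the non-empty lines of a text file, without trailing newlines."""
    sentences = []
    # encoding="utf-8" is an assumption about the input file
    with open(path, "r", encoding="utf-8") as f:
        for line in f:              # reads one line at a time
            line = line.strip()     # drop the trailing "\n" and surrounding spaces
            if line:                # skip blank lines, such as a final empty line
                sentences.append(line)
    return sentences
```

Each string returned by read_sentences("/Users/abc/Desktop/test_data.txt") could then be passed to nlp.annotate in place of line.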

How to iterate through each line of a text file and get the sentiment of those lines using Python?
