Question

I am new to text mining. I have a dataset of user comments on work completed by technicians.

Below are some rows from the dataset.

id    comments
1     That's not good job as it was not completed on mentioned time.
2     nice job
3     good job but not satisfied

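I don't have any labels here, and I want to perform sentiment analysis (find a polarity score and classify the comments as positive, negative, or neutral).

What I have done so far: I got text files of positive and negative words from Google. I then check each comment against these files and, if a word is present, assign the corresponding label. But I am not getting the expected results: every comment ends up labeled positive.

Another problem is that the first comment contains "not good", which should be negative. Because my positive word file contains "good", the comment is labeled positive as soon as "good" is found, which is wrong.

Here is my code: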

pos_words_list = [w.lower() for w in pos_words_list]
neg_words_list = [w.lower() for w in neg_words_list]

def assign_comments_labels(x):
    try:
        if any(w in x for w in pos_words_list):
            return 'positive'
        elif any(w in x for w in neg_words_list):
            return 'negative'
        else:
            return 'neutral'
    except:
        return 'neutral'



df['COMMENTS'] = df['COMMENTS'].str.lower()

df['labels'] = df['COMMENTS'].apply(lambda x: assign_comments_labels(x))

df[['COMMENTS','labels']].head()

Output:

id    comments                                                    labels
1     That not good job was not completed on mentioned time.      positive
2     nice job                                                    negative
3     good job but not satisfied                                  neutral
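
One thing worth checking in the code above: `w in x` with `x` a string is a substring test, so a short lexicon entry can match inside an unrelated word, and the positive list is always checked first. A minimal sketch of whole-word matching with a simple negation flip follows. The tiny word sets, the `NEGATORS` set, the two-token negation window, and the function names are all assumptions for illustration, not the contents of the downloaded files.

```python
import re

# Hypothetical mini-lexicons; the real ones would be loaded from the word files.
POS_WORDS = {"good", "nice", "satisfied"}
NEG_WORDS = {"bad", "poor", "late"}
# Assumed list of negation words that flip the polarity of what follows.
NEGATORS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and split it into whole words."""
    return re.findall(r"[a-z']+", text.lower())


def polarity_score(text, window=2):
    """Sum +1/-1 per lexicon word, flipping words that closely follow a negator."""
    score = 0
    negate_for = 0  # number of upcoming tokens still in the scope of a negator
    for tok in tokenize(text):
        if tok in NEGATORS:
            negate_for = window
            continue
        value = 1 if tok in POS_WORDS else -1 if tok in NEG_WORDS else 0
        if negate_for > 0:
            value = -value
            negate_for -= 1
        score += value
    return score


def label_comment(text):
    """Map the polarity score to one of three classes."""
    score = polarity_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for comment in [
    "That's not good job as it was not completed on mentioned time.",
    "nice job",
    "good job but not satisfied",
]:
    print(label_comment(comment), "|", comment)
```

With these assumed lists the three sample comments come out as negative, positive, and neutral respectively (the third has one positive and one negated positive word, which cancel). Contractions such as "wasn't" are not treated as negators in this sketch.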

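Could anyone please tell me how to achieve this? What is the right way to assign labels and perform sentiment analysis? What else can I do to explore further and get meaningful insights from the text data?

Python: assigning labels to text data using positive and negative word files for sentiment analysis (text analytics/mining)?

0 answers: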
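
For exploration, one simple option is to count frequent word pairs, since pairs surface phrases such as "not good" that single-word lists miss. The sketch below uses only the standard library; the function name `bigram_counts` and the sample list are illustrative assumptions.

```python
import re
from collections import Counter


def bigram_counts(comments):
    """Count adjacent word pairs across all comments."""
    counts = Counter()
    for text in comments:
        tokens = re.findall(r"[a-z']+", text.lower())
        # zip pairs each token with its successor, giving the bigrams.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Sample rows taken from the dataset shown in the question.
comments = [
    "That's not good job as it was not completed on mentioned time.",
    "nice job",
    "good job but not satisfied",
]

counts = bigram_counts(comments)
for pair, n in counts.most_common(5):
    print(n, " ".join(pair))
```

On a real dataset, the most common pairs that start with a negation word are natural candidates for special handling in the labeling step.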