Question

I am trying to do some analysis on Twitter data. I have the words of a tweet in a character vector:

> head(words)
[1] "#fabulous" "rock" "is" "#destined" "to" "be" "star"

> head(hashtags)
      hashtags score
1    #fabulous 7.526
2   #excellent 7.247
3      #superb 7.199
4  #perfection 7.099
5    #terrific 6.922
6 #magnificent 6.672

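I want to check the words in the character vector against the hashtags data frame and, for each match, add up the score values. For the example above, I would like the output to be 7.526 + 6.922 = 14.448.

Any help is much appreciated.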

Answer 1

Try this:

words_hashtags <- words[grepl('^#', words)]
scores <- hashtags[hashtags$hashtags %in% words_hashtags, 'score']
sum(scores)

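grepl returns a logical vector indicating which words start with a hash sign (#), and that vector is used to subset words. %in% then flags the rows of hashtags whose tag appears among those words, and sum adds up their scores. The rest is just basic R syntax.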
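To make the intermediate values visible, here is a minimal sketch of the same steps on sample data. The hashtags data frame copies the rows printed in the question; the words vector is an assumption: it replaces the question's last word with "#terrific" so that two tags match and the total equals the 14.448 the question expects. The names is_hashtag, matched and total are introduced only for illustration.

```r
# Hypothetical tweet: the question's words, with "#terrific" as the last word
words <- c("#fabulous", "rock", "is", "#destined", "to", "be", "#terrific")

# Score table, copied from the head(hashtags) output in the question
hashtags <- data.frame(
  hashtags = c("#fabulous", "#excellent", "#superb",
               "#perfection", "#terrific", "#magnificent"),
  score = c(7.526, 7.247, 7.199, 7.099, 6.922, 6.672),
  stringsAsFactors = FALSE
)

# Step 1: logical vector, TRUE where a word starts with '#'
is_hashtag <- grepl("^#", words)

# Step 2: keep only the hashtag words
words_hashtags <- words[is_hashtag]

# Step 3: flag the rows of the score table whose tag occurs in the tweet
matched <- hashtags$hashtags %in% words_hashtags

# Step 4: pull the scores of the flagged rows and add them up
scores <- hashtags[matched, "score"]
total <- sum(scores)
print(total)
```

Note that "#destined" is a hashtag but has no row in the score table, so it contributes nothing. Also, %in% flags each row of the table at most once, so a tag repeated within the same tweet is still counted a single time.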

More options for getting words_hashtags:

words_hashtags <- grep('^#', words, value = TRUE)    # returns the matching words
words_hashtags <- words[grep('^#', words, value = FALSE)]    # returns indices, then subsets
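As a quick check that these forms are interchangeable, the sketch below runs all three on the words from the question and compares the results. The variable names via_grepl, via_value and via_index are hypothetical, chosen only for this comparison.

```r
# Words from the question
words <- c("#fabulous", "rock", "is", "#destined", "to", "be", "star")

# grepl: logical vector used as a subset mask
via_grepl <- words[grepl("^#", words)]

# grep with value = TRUE: returns the matching elements directly
via_value <- grep("^#", words, value = TRUE)

# grep with value = FALSE (the default): returns integer positions
via_index <- words[grep("^#", words, value = FALSE)]

# All three yield the same character vector
same <- identical(via_grepl, via_value) && identical(via_value, via_index)
print(same)
```

Which one to use is mostly a matter of taste; the grepl form keeps the logical mask around if it is needed again later.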
