Question

Based on http://tidytextmining.com/sentiment.html#the-sentiments-dataset, I am trying to perform sentiment analysis on a single element.

Setting up the tibble:

library(tidyverse)  # tibble, dplyr, tidyr, ggplot2
library(tidytext)   # get_sentiments()

url <- c("t1", "t2")
word <- c("abnormal", "good")
n <- c(1, 1)
score <- c(1, 2)
res <- as_tibble(data.frame("url" = url, "word" = word, "n" = n, "score" = score, stringsAsFactors = FALSE))
res

This produces:

# A tibble: 2 x 4
    url     word     n score
  <chr>    <chr> <dbl> <dbl>
1    t1 abnormal     1     1
2    t2     good     1     2

Getting the sentiments:

joined_sentiments <- res %>% inner_join(get_sentiments("bing"))
joined_sentiments

This produces:

# A tibble: 2 x 5
    url     word     n score sentiment
  <chr>    <chr> <dbl> <dbl>     <chr>
1    t1 abnormal     1     1  negative
2    t2     good     1     2  positive

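How can I convert this into a series of plots, one plot per URL, similar to the figure at

http://tidytextmining.com/sentiment.html#the-sentiments-dataset

Since my data has no line numbers to use as an index, I am trying: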

joined_sentiments %>%
  count(url, index=n, sentiment) %>%
  spread(sentiment, n, fill = 0) %>%
  mutate(sentiment = positive - negative)

This returns an error:

joined_sentiments %>%
+   count(url, index=n, sentiment) %>%
+   spread(sentiment, n, fill = 0) %>%
+   mutate(sentiment = positive - negative)
Error: `var` must evaluate to a single number or a column name, not a double vector
In addition: Warning message:
In if (!is.finite(x)) return(FALSE) :
  the condition has length > 1 and only the first element will be used

Answer 1

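The main reason for the error/warning is that an 'n' column already exists in the dataset. count creates a column named 'n' by default, so when it is applied here the new count column is renamed to 'nn'.

res %>% inner_join(get_sentiments("bing")) %>% count(url, index = n, sentiment)
#Joining, by = "word"
# A tibble: 2 x 4
#    url index sentiment    nn
#  <chr> <dbl>     <chr> <int>
#1    t1     1  negative     1
#2    t2     1  positive     1

In the subsequent step, spread converts the data to 'wide' format using the column name 'n', which no longer matches because the count column is now 'nn'. So, either change it to 'nn':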

res1 <- res %>% 
          inner_join(get_sentiments("bing")) %>% 
          count(url, index=n, sentiment) %>% 
          spread(sentiment, nn, fill = 0) %>%
          mutate(sentiment = positive - negative)
res1
#Joining, by = "word"
# A tibble: 2 x 5
#     url index negative positive sentiment
#   <chr> <dbl>    <dbl>    <dbl>     <dbl>
#1    t1     1        1        0        -1
#2    t2     1        0        1         1

Then, using ggplot, we can do the following (with only two rows of data, the output may not look very good):

ggplot(res1, aes(index, sentiment, fill = url)) +
     geom_col(show.legend = FALSE) +
     facet_wrap(~url, ncol = 2, scales = "free_x")

Or remove the column 'n' when creating 'res'; the OP's original code then works as-is:

res <- tibble(url , word, score)
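
A possible third option, offered only as a sketch: name the count column explicitly through the `name` argument of `count()` (available in dplyr 0.8.0 and later), so that nothing depends on whether the default ends up as 'n' or 'nn'. The column name `hits` and the two-row `lexicon` tibble are assumptions made for this illustration; `lexicon` stands in for `get_sentiments("bing")` so the example runs without tidytext.

```r
library(dplyr)
library(tidyr)

# Same toy data as in the question, including the pre-existing 'n' column.
res <- tibble(
  url   = c("t1", "t2"),
  word  = c("abnormal", "good"),
  n     = c(1, 1),
  score = c(1, 2)
)

# Hypothetical stand-in for get_sentiments("bing"), limited to the two
# words used here.
lexicon <- tibble(
  word      = c("abnormal", "good"),
  sentiment = c("negative", "positive")
)

res2 <- res %>%
  inner_join(lexicon, by = "word") %>%
  # 'hits' is an arbitrary name; it cannot collide with the existing 'n'.
  count(url, index = n, sentiment, name = "hits") %>%
  # Refer to the count column by the name chosen above.
  spread(sentiment, hits, fill = 0) %>%
  mutate(sentiment = positive - negative)

res2
```

The resulting `res2` has the same shape as `res1` above, so it can be passed to the same ggplot call.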
