Based on http://tidytextmining.com/sentiment.html#the-sentiments-dataset, I am trying to run a sentiment analysis on a set of items.
Setting up the tibble:
library(dplyr)
library(tidyr)
library(tidytext)
library(ggplot2)
url <- c("t1", "t2")
word <- c("abnormal", "good")
n <- c(1, 1)
score <- c(1, 2)
res <- as_tibble(data.frame("url" = url, "word" = word, "n" = n, "score" = score, stringsAsFactors = FALSE))
res
This produces:
# A tibble: 2 x 4
url word n score
<chr> <chr> <dbl> <dbl>
1 t1 abnormal 1 1
2 t2 good 1 2
Adding the sentiments:
joined_sentiments <- res %>% inner_join(get_sentiments("bing"))
joined_sentiments
This produces:
# A tibble: 2 x 5
url word n score sentiment
<chr> <chr> <dbl> <dbl> <chr>
1 t1 abnormal 1 1 negative
2 t2 good 1 2 positive
How can I turn this into a series of plots, one per URL, similar to the figure at http://tidytextmining.com/sentiment.html#the-sentiments-dataset ?
Since my data has no line numbers, I am trying:
joined_sentiments %>%
count(url, index=n, sentiment) %>%
spread(sentiment, n, fill = 0) %>%
mutate(sentiment = positive - negative)
This returns an error:
joined_sentiments %>%
+ count(url, index=n, sentiment) %>%
+ spread(sentiment, n, fill = 0) %>%
+ mutate(sentiment = positive - negative)
Error: `var` must evaluate to a single number or a column name, not a double vector
In addition: Warning message:
In if (!is.finite(x)) return(FALSE) :
the condition has length > 1 and only the first element will be used
Answer 0 (score: 1)
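The main cause of the error/warning is that a column named 'n' already exists in the dataset. count creates an 'n' column by default, so when it is applied here the new count column is renamed to 'nn'.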
res %>%
  inner_join(get_sentiments("bing")) %>%
  count(url, index = n, sentiment)
#Joining, by = "word"
# A tibble: 2 x 4
#  url   index sentiment    nn
#  <chr> <dbl> <chr>     <int>
#1 t1        1 negative      1
#2 t2        1 positive      1
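In the subsequent step, the spread to 'wide' format refers to the column name 'n', which no longer matches because the count column is now called 'nn'. So either change it to 'nn':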
res1 <- res %>%
inner_join(get_sentiments("bing")) %>%
count(url, index=n, sentiment) %>%
spread(sentiment, nn, fill = 0) %>%
mutate(sentiment = positive - negative)
res1
#Joining, by = "word"
# A tibble: 2 x 5
# url index negative positive sentiment
# <chr> <dbl> <dbl> <dbl> <dbl>
#1 t1 1 1 0 -1
#2 t2 1 0 1 1
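A variant that avoids relying on the automatic 'nn' rename is to name the count column explicitly. The sketch below assumes dplyr 0.8 or later (for the `name` argument of count) and tidyr 1.1 or later (for pivot_wider, the newer counterpart of spread). The tibble `joined` is a hand-built stand-in for the joined result shown in the question, and the column name `hits` is only an illustrative choice:

```r
library(dplyr)
library(tidyr)

# Hand-built stand-in for the joined result in the question,
# so the example runs without a lexicon lookup
joined <- tibble(
  url = c("t1", "t2"),
  word = c("abnormal", "good"),
  n = c(1, 1),
  score = c(1, 2),
  sentiment = c("negative", "positive")
)

res2 <- joined %>%
  # name the count column explicitly so it cannot clash with the existing 'n'
  count(url, index = n, sentiment, name = "hits") %>%
  # one column per sentiment value; missing combinations are filled with 0
  pivot_wider(names_from = sentiment, values_from = hits, values_fill = 0) %>%
  # net sentiment per url/index
  mutate(sentiment = positive - negative)

res2
```

With the two rows above, this should give the same net sentiment per URL as 'res1'.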
Then, using ggplot, we can do the following (with only two rows of data, the output may not look great):
ggplot(res1, aes(index, sentiment, fill = url)) +
geom_col(show.legend = FALSE) +
facet_wrap(~url, ncol = 2, scales = "free_x")
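Because the two-row example yields a single bar per panel, it can be easier to see what the faceted plot does on data with several index values per URL. The sketch below reuses the same plotting call on a data frame `toy` whose values are invented purely for illustration; they are not results from any real text:

```r
library(ggplot2)

# Hypothetical, hand-made values for illustration only (not real results):
# three index positions per URL, same columns as 'res1' uses in the plot
toy <- data.frame(
  url = rep(c("t1", "t2"), each = 3),
  index = rep(1:3, times = 2),
  sentiment = c(-1, 2, 1, 3, -2, 0)
)

# One panel per URL, one bar per index position
p <- ggplot(toy, aes(index, sentiment, fill = url)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~url, ncol = 2, scales = "free_x")

p
```

In the linked tutorial the index is derived from line numbers; here it is simply whatever grouping variable is available for each URL.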
Alternatively, leave out the 'n' column when creating 'res'; the OP's original code then runs without changes:
res <- tibble(url , word, score)