R widyr package (correlation values are NaN)

Asked: 2019-02-03 01:06:22

Tags: r tidyr tidytext

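I am analyzing pairwise correlations between words that appear in user reviews, and I want to plot them as a correlation network graph.

My sample data looks like this: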

review_corwords

           Label Rating                word
1            1      1                connect
1.1          1      1                    gps
1.2          1      1                    app
1.3          1      1                connect
1.4          1      1                    gps
1.5          1      1                 matter
1.6          1      1                   long
1.7          1      1                    gps
1.8          1      1                    set
1.9          1      1                   high
1.10         1      1               accuracy
1.11         1      1                setting
1.12         1      1                 appear
1.13         1      1                    set
1.14         1      1                    app
1.15         1      1                useless
1.16         1      1                   cant
1.17         1      1                  track
1.18         1      1                workout
2            1      5                   wish
2.1          1      5                  would
2.2          1      5               interest
2.3          1      5                 google
2.4          1      5                provide
2.5          1      5                 weekly
2.6          1      5                monthly
2.7          1      5                summary
3            1      1                useless

Then I run this:

library(dplyr)  # provides %>%
library(widyr)
# count words co-occurring within a label
word_pairs <- review_corwords %>%
  pairwise_count(word, Label, sort = TRUE)

The output is as follows:

# A tibble: 16,333,722 x 3
   item1    item2       n
   <chr>    <chr>   <dbl>
 1 gps      connect     1
 2 app      connect     1
 3 matter   connect     1
 4 long     connect     1
 5 set      connect     1
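
For reference, here is a minimal base R sketch of what a co-occurrence count per Label amounts to. It is not widyr's implementation; the toy data frame and the names `toy`, `m`, and `co_counts` are assumptions made up for illustration.

```r
# Toy data (hypothetical), with the same columns as review_corwords.
toy <- data.frame(
  Label = c(1, 1, 1, 2, 2),
  word  = c("gps", "app", "gps", "gps", "wish"),
  stringsAsFactors = FALSE
)

# One row per Label, one column per word; 1 if the word occurs at all.
m <- (unclass(table(toy$Label, toy$word)) > 0) * 1

# Entry [i, j] is the number of Labels that contain both word i and word j.
co_counts <- crossprod(m)
print(co_counts)
```

If every pair in the real output has n equal to 1, that would be consistent with all of those words sharing a single Label value.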

However, when I try to compute the correlations, I get the following:

word_cors <- review_corwords %>%
  group_by(word) %>%
  pairwise_cor(word, Label, sort = TRUE)

# A tibble: 16,333,722 x 3
   item1    item2   correlation
   <chr>    <chr>         <dbl>
 1 gps      connect         NaN
 2 app      connect         NaN
 3 matter   connect         NaN
 4 long     connect         NaN
 5 set      connect         NaN
 6 high     connect         NaN

I need to get the correct pairwise correlation values for the words. Please help.

0 answers:

No answers yet.