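tidyr / data.table: gather/melt values only, discard the key

Time: 2017-07-30 19:01:47

Tags: r data.table dplyr tidyr

I want to gather/melt a data frame so that the output contains only the values, with no key column. All values should end up in a single column.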

library(tidyverse)
library(tidytext)

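nrc <- get_sentiments("nrc")

The data I am working with looks like this:

# dcast() is not part of the tidyverse; it is provided by reshape2 or data.table
nrc_wide <- dcast(nrc, word ~ sentiment)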

# sample output:

         word anger anticipation disgust fear  joy negative positive 
1      abacus  <NA>         <NA>    <NA> <NA> <NA>     <NA>     <NA>    
2     abandon  <NA>         <NA>    <NA> fear <NA> negative     <NA> 
3   abandoned anger         <NA>    <NA> fear <NA> negative     <NA> 
4 abandonment anger         <NA>    <NA> fear <NA> negative     <NA> 

I would like to turn it back into the shape of the original data set:

     word sentiment
     <chr>     <chr>
1    abacus     trust
2   abandon      fear
3   abandon  negative
4   abandon   sadness

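I have tried many gather and melt options, but what I need is not the usual key-value format.

1 answer:

Answer 0 (score: 4)

A regular tidyr::gather call should do it, but you need to 1) drop the NAs with na.rm = TRUE; 2) exclude the word column from the reshape so that it stays in place; 3) drop the key column afterwards: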

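library(tidyverse)

nrc_wide %>% 
    # reshape every column except word; na.rm = TRUE drops the NA cells
    gather(key, sentiment, -word, na.rm = TRUE) %>% 
    # the key column (the former column names) is not wanted
    select(-key) %>% 
    arrange(word) %>% 
    head()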

#       word sentiment
#1    abacus     trust
#2   abandon      fear
#3   abandon  negative
#4   abandon   sadness
#5 abandoned     anger
#6 abandoned      fear
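
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The same three steps can be sketched with it as follows. The toy_wide data frame below is a small hand-written stand-in for nrc_wide (an assumption for illustration, not the real lexicon), so the example runs without tidytext:

```r
library(tidyr)
library(dplyr)

# Hypothetical toy stand-in for nrc_wide: one column per sentiment,
# NA where the word does not carry that sentiment.
toy_wide <- data.frame(
  word  = c("abacus", "abandon", "abandoned"),
  anger = c(NA, NA, "anger"),
  fear  = c(NA, "fear", "fear"),
  trust = c("trust", NA, NA),
  stringsAsFactors = FALSE
)

toy_long <- toy_wide %>%
  # 1) and 2): reshape everything except word, dropping NA values
  pivot_longer(-word, names_to = "key", values_to = "sentiment",
               values_drop_na = TRUE) %>%
  # 3): discard the key column, keeping only word and sentiment
  select(-key) %>%
  arrange(word)

print(toy_long)
```

Here values_drop_na = TRUE plays the role of na.rm = TRUE in gather().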

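Using data.table::melt:

library(data.table)
# setDT() converts nrc_wide to a data.table by reference (no copy is made)
melt(setDT(nrc_wide), id.vars = "word", na.rm = TRUE)[, 
    .(word, sentiment = value)   # keep word and the value column, renamed to sentiment
][order(word)]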

#            word    sentiment
#    1:    abacus        trust
#    2:   abandon         fear
#    3:   abandon     negative
#    4:   abandon      sadness
#    5: abandoned        anger
#   ---                       
#13897:      zest anticipation
#13898:      zest          joy
#13899:      zest     positive
#13900:      zest        trust
#13901:       zip     negative
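
To try the data.table approach without downloading the lexicon, here is a minimal sketch on a hand-written toy table. The toy_wide object is hypothetical and only mimics the layout of nrc_wide. As a small variation on the answer, it uses the value.name argument of melt() to name the value column directly instead of renaming it afterwards:

```r
library(data.table)

# Hypothetical toy stand-in for nrc_wide (not the real lexicon)
toy_wide <- data.table(
  word  = c("abacus", "abandon", "abandoned"),
  anger = c(NA, NA, "anger"),
  fear  = c(NA, "fear", "fear"),
  trust = c("trust", NA, NA)
)

toy_long <- melt(
  toy_wide,
  id.vars    = "word",       # word stays as an identifier column
  value.name = "sentiment",  # name of the melted value column
  na.rm      = TRUE          # drop the NA cells
)[, .(word, sentiment)       # drop the "variable" (key) column
][order(word)]

print(toy_long)
```

Only the word and sentiment columns remain, with one row per non-NA cell of the wide table.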