Question

I have a tidy data frame:

  > data.frame("topic" = c(1,1,2,2,3,3), 
               "term" = c("will", "eat", "go", "fun", "good", "bad"), 
               "score" = c(0.3, 0.2, 0.5, 0.4, 0.1, 0.05))

      topic term score
    1     1 will  0.30
    2     1  eat  0.20
    3     2   go  0.50
    4     2  fun  0.40
    5     3 good  0.10
    6     3  bad  0.05

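The purpose of this table is to store the top n (in this case, 2) scoring terms for each topic. The table is easy to work with, but I would like to be able to view the data like this: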

      topic1  topic2  topic3
   1    will      go    good
   2     eat     fun     bad

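In this new table I don't care about the scores; I just want to see the top n scoring terms for each topic. I feel like this should be possible with dplyr or something similar, but I'm not very good with R.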

Answer 1

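library(reshape2)
# df is the data frame from the question.
# ave(..., FUN = seq_along) numbers the rows within each topic (1, 2, ...);
# that within-topic rank becomes the row id, and [,-1] drops it afterwards.
dcast(df, ave(df$topic, df$topic, FUN = seq_along)~topic, value.var = "term")[,-1]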
#     1   2    3
#1 will  go good
#2  eat fun  bad

OR

library(dplyr)
bind_cols(lapply(split(df, df$topic), function(a) a["term"]))
#  term term1 term2
#1 will    go  good
#2  eat   fun   bad
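
For comparison, here is a minimal base R sketch of the same reshaping that needs no extra packages. It is not part of the answer above. It assumes every topic has the same number of terms (otherwise `as.data.frame` will fail on columns of unequal length), and the names `parts` and `wide` are made up for illustration. Sorting by descending score within each topic is an added step so that the rows come out in rank order even if the input is not already sorted.

```r
# Rebuild the example data from the question.
df <- data.frame(topic = c(1, 1, 2, 2, 3, 3),
                 term  = c("will", "eat", "go", "fun", "good", "bad"),
                 score = c(0.3, 0.2, 0.5, 0.4, 0.1, 0.05),
                 stringsAsFactors = FALSE)

# Sort by topic, then by descending score within each topic.
df <- df[order(df$topic, -df$score), ]

# One character vector of terms per topic (list names are "1", "2", "3").
parts <- split(df$term, df$topic)

# Bind the vectors side by side; assumes equal lengths (same n per topic).
wide <- as.data.frame(parts, stringsAsFactors = FALSE)
names(wide) <- paste0("topic", names(parts))

print(wide)
#   topic1 topic2 topic3
# 1   will     go   good
# 2    eat    fun    bad
```

The scores are used only for ordering and are dropped from the result, which matches the layout asked for in the question.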
