I have a tidy data frame:
> data.frame("topic" = c(1,1,2,2,3,3),
"term" = c("will", "eat", "go", "fun", "good", "bad"),
"score" = c(0.3, 0.2, 0.5, 0.4, 0.1, 0.05))
  topic term score
1     1 will  0.30
2     1  eat  0.20
3     2   go  0.50
4     2  fun  0.40
5     3 good  0.10
6     3  bad  0.05
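So the purpose of this table is to store the top n (in this case 2) scoring terms for each topic. The table is easy to work with, but I would like to be able to view the data like this: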
  topic1 topic2 topic3
1   will     go   good
2    eat    fun    bad
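In this new table I don't care about the scores; I just want to see the top n scoring terms for each topic. I feel this should be doable with dplyr or something similar, but I'm not very good at R.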
Answer 0 (score: 1)
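# df is the data frame from the question
library(reshape2)
# ave(..., FUN = seq_along) numbers the rows within each topic; [,-1] drops that helper column
dcast(df, ave(df$topic, df$topic, FUN = seq_along)~topic, value.var = "term")[,-1]
#     1   2    3
#1 will  go good
#2  eat fun  bad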
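To see why the dcast() formula works, it can help to look at the helper column on its own. This is a minimal base R sketch, assuming the same data as in the question and the name df that the answer uses. ave() with seq_along gives each row its position within its topic (1, 2, 1, 2, 1, 2 here), and dcast() uses that position as the row identifier.

```r
# Same data as in the question; the name df is what the answer assumes
df <- data.frame(
  topic = c(1, 1, 2, 2, 3, 3),
  term  = c("will", "eat", "go", "fun", "good", "bad"),
  score = c(0.3, 0.2, 0.5, 0.4, 0.1, 0.05),
  stringsAsFactors = FALSE
)

# Position of each row within its topic
idx <- ave(df$topic, df$topic, FUN = seq_along)
print(idx)
```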
OR
library(dplyr)
bind_cols(lapply(split(df, df$topic), function(a) a["term"]))
# (the repaired column names depend on the dplyr version)
#  term term1 term2
#1 will    go  good
#2  eat   fun   bad
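Since the question mentions dplyr, here is one more sketch that combines dplyr with tidyr::pivot_wider(). It assumes tidyr 1.0.0 or later is installed, and the names rank and wide are made up for illustration. Unlike the two approaches above, it sorts by score within each topic first, so it does not rely on the rows already being in order, and it produces the topic1, topic2, topic3 column names asked for in the question.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  topic = c(1, 1, 2, 2, 3, 3),
  term  = c("will", "eat", "go", "fun", "good", "bad"),
  score = c(0.3, 0.2, 0.5, 0.4, 0.1, 0.05),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(topic) %>%
  arrange(desc(score), .by_group = TRUE) %>%  # best-scoring term first within each topic
  mutate(rank = row_number()) %>%             # hypothetical helper column: position within topic
  ungroup() %>%
  select(-score) %>%                          # scores are not wanted in the wide view
  pivot_wider(names_from = topic,
              values_from = term,
              names_prefix = "topic") %>%     # one column per topic
  select(-rank)                               # drop the helper column

print(wide)
```

If the topics do not all have the same number of terms, pivot_wider() fills the missing cells with NA rather than failing.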