Suppose I have the following data frame of word frequencies:
Bob Joe Go Eat Run
doc1 2 0 0 1 2
doc2 0 1 1 2 0
I need to generate a character vector like this:
chr[1:2] "Bob Bob Eat Run Run"
"Joe Go Eat Eat"
Answer 0 (score: 2)
You can try the following:
df <- data.frame(Bob = c(2, 0), Joe = c(0, 1), Go = c(0, 1), Eat = c(1, 2), Run = c(2, 0))
row.names(df) <- c('doc1', 'doc2')
df
Bob Joe Go Eat Run
doc1 2 0 0 1 2
doc2 0 1 1 2 0
apply(df, 1, function(x) paste(rep(names(df), x), collapse = ' '))
doc1 doc2
"Bob Bob Eat Run Run" "Joe Go Eat Eat"
If you don't want the names on the vector shown above and would rather have a plain character vector, you can do:
as.character(apply(df, 1, function(x) paste(rep(names(df), x), collapse = ' ')))
[1] "Bob Bob Eat Run Run" "Joe Go Eat Eat"
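The same idea can be wrapped in a small reusable function. The following is only a sketch: the name freq_to_text and its sep argument are hypothetical, introduced here for illustration, and it assumes every column holds non-negative whole-number counts. It uses vapply instead of apply so that the result is guaranteed to be an unnamed character vector with one string per row.

```r
# Rebuild the example data frame from the question.
df <- data.frame(Bob = c(2, 0), Joe = c(0, 1), Go = c(0, 1),
                 Eat = c(1, 2), Run = c(2, 0),
                 row.names = c("doc1", "doc2"))

# freq_to_text is a hypothetical helper name, not part of any package.
# It assumes all columns of `freq` are non-negative integer counts.
freq_to_text <- function(freq, sep = " ") {
  words <- names(freq)
  vapply(seq_len(nrow(freq)), function(i) {
    # Counts for row i as a plain numeric vector.
    counts <- unlist(freq[i, ], use.names = FALSE)
    # Repeat each column name by its count, then join into one string.
    paste(rep(words, times = counts), collapse = sep)
  }, character(1))
}

result <- freq_to_text(df)
```

Because the index passed to vapply is an integer sequence, the result carries no names, so no as.character step is needed afterwards.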
Answer 1 (score: 1)
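Here is an option using data.table. Convert the 'data.frame' to a 'data.table' with setDT, group by the row sequence, unlist the Subset of Data.table (.SD) to get that row's counts, replicate the column names of df by those counts, and then paste the result together with toString. Note that toString separates the words with ", " rather than a single space.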
library(data.table)
setDT(df)[, toString(rep(names(df), unlist(.SD))) ,1:nrow(df)]$V1
#[1] "Bob, Bob, Eat, Run, Run" "Joe, Go, Eat, Eat"
Or use tapply from base R:
tapply(unlist(df), row(df), FUN= function(x)
toString(rep(names(df), x)))
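Since toString joins with ", ", the tapply call above yields comma-separated strings carrying the row index as names. If the space-separated, unnamed vector from the question is wanted, one possible variation (a sketch under the same example data, not a different method) is to swap toString for paste with collapse = " " and strip the attributes with as.character:

```r
# Rebuild the example data frame from the question.
df <- data.frame(Bob = c(2, 0), Joe = c(0, 1), Go = c(0, 1),
                 Eat = c(1, 2), Run = c(2, 0),
                 row.names = c("doc1", "doc2"))

# unlist(df) walks the counts column by column; row(df) gives the row
# index of each count in the same order, so tapply groups counts by row.
by_row <- tapply(unlist(df), row(df), FUN = function(x)
  paste(rep(names(df), x), collapse = " "))

# as.character drops the dim and dimnames that tapply attaches.
res <- as.character(by_row)
```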