I have a data frame that looks like this:
> testdata
topic1 topic2 topic3 topic4 topic5
church 0.011 0.003 0.001 0.001 0.012
of 0.094 0.085 0.098 0.063 0.051
the 0.143 0.115 0.159 0.083 0.097
appearance 0.000 0.000 0.002 0.005 0.040
restrain 0.000 0.000 0.000 0.000 0.000
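What I need to do is create a new data frame, also 5 rows by 5 columns, in which each column holds the row names of this data frame ordered by that column. In other words, I need to sort the data frame by each column in descending order and put the row names, in that order, under the column's header, mainly so that I get the words in ranked order. For this example, the data frame I need is: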
> testdata_word_ranks
topic1 topic2 topic3 topic4 topic5
church the the the the the
of of of of of of
the church church appearance appearance appearance
appearance appearance appearance church church church
restrain restrain restrain restrain restrain restrain
Here is my failed attempt at assigning the columns of testdata_word_ranks above to a new data frame:
for(i in 1:nrow(testdata)){
minidf = data.frame(rownames(testdata), testdata[,i])
assign(paste0('testdata_word_ranks$topic', i),
as.vector(minidf[order(minidf[,2], decreasing = TRUE),]$rownames.testdata))
}
FYI, this data comes from a topic model fitted to a particular corpus.
Answer 0 (score: 3)
You can index the row names by the ordering of each column:
matrix(row.names(testdata)[apply(-testdata, 2, order)], nrow(testdata))
# [,1] [,2] [,3] [,4] [,5]
# [1,] "the" "the" "the" "the" "the"
# [2,] "of" "of" "of" "of" "of"
# [3,] "church" "church" "appearance" "appearance" "appearance"
# [4,] "appearance" "appearance" "church" "church" "church"
# [5,] "restrain" "restrain" "restrain" "restrain" "restrain"
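If a data frame that keeps the original column names is wanted rather than an unnamed character matrix, one possible variant is to apply the same ordering idea column by column with lapply. This is only a sketch and not part of the answer above: the data is retyped from the printed output in the question, and the result name testdata_word_ranks is borrowed from the question.

```r
# Rebuild the example data (values copied from the printed output in the question).
testdata <- data.frame(
  topic1 = c(0.011, 0.094, 0.143, 0.000, 0.000),
  topic2 = c(0.003, 0.085, 0.115, 0.000, 0.000),
  topic3 = c(0.001, 0.098, 0.159, 0.002, 0.000),
  topic4 = c(0.001, 0.063, 0.083, 0.005, 0.000),
  topic5 = c(0.012, 0.051, 0.097, 0.040, 0.000),
  row.names = c("church", "of", "the", "appearance", "restrain")
)

# For each column, reorder the row names by decreasing value.
# order(-col) sorts descending; tied values keep their original row order.
testdata_word_ranks <- as.data.frame(
  lapply(testdata, function(col) rownames(testdata)[order(-col)]),
  stringsAsFactors = FALSE
)

print(testdata_word_ranks)
```

As for the loop in the question, assign() treats its first argument as a literal variable name, so it creates objects named "testdata_word_ranks$topic1" and so on instead of adding columns to a data frame. The loop also iterates over nrow(testdata) while indexing columns, which only happens to work here because the data is 5 by 5.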