This seems like a simple problem, but I can't figure it out. I'm trying to sort the words in a vector by how often they occur.
For example, I tried:
x = 'it was a warm and sunny day it was a great day'
x = unlist(strsplit(x,' '))
unlist(sapply(x,sort,decreasing=T))
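However, this just seems to return the words in the order they appear. Any help would be appreciated.
Answer 0: (score: 5)
As @akrun pointed out in the comments, you probably want a table. (sapply(x, sort) sorts each one-word element on its own, so nothing gets reordered.)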
sort(table(x), decreasing=TRUE)
## x
##     a   day    it   was   and great sunny  warm
##     2     2     2     2     1     1     1     1
Or you may want a vector of the values in order of frequency:
names(sort(table(x), decreasing=TRUE))
## [1] "a" "day" "it" "was" "and" "great" "sunny" "warm"
Or perhaps you want the original vector with every original element kept, ordered by frequency, like this:
rep(names(sort(table(x), decreasing=TRUE)), sort(table(x), decreasing=TRUE))
## [1] "a" "a" "day" "day" "it" "it" "was" "was" "and" "great" "sunny" "warm"
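The last variant computes the sorted table twice. A minimal sketch that stores it once and reuses it, assuming x is the split word vector from the question; the names freq and by_freq are only introduced here for illustration:

```r
# Rebuild the word vector from the question
x <- unlist(strsplit('it was a warm and sunny day it was a great day', ' '))

# Count each word once, most frequent first
freq <- sort(table(x), decreasing = TRUE)

# Repeat each word by its count to get the full vector ordered by frequency
by_freq <- rep(names(freq), times = as.vector(freq))
```

names(freq) then gives the distinct words in frequency order, and by_freq has the same length as x.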