How to create a vertically arranged value/frequency table
> df = data.frame(fruit=c("apple", "banana", "cherry", "cherry", "apple", "banana", "apple", "date"));
> table(df$fruit)
apple banana cherry date
3 2 2 1
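So far so good. But I would like something like this instead (for example, for basic operations on the frequencies and for subsetting the values):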
Fruit Freq
"apple" 3
"banana" 2
"cherry" 2
"date" 1
In SQL, that would be SELECT fruit, COUNT(*) AS Freq FROM df GROUP BY fruit, and it would produce a table like the one used as the starting point of this question: https://stats.stackexchange.com/questions/15574/how-to-convert-a-frequency-table-into-a-vector-of-values
Is there a simple way to do this in R? (Or does this show a mindset that is too 'SQL' and not 'R' enough?)
Answer 0: (score: 6)
data.frame(table(df))
# df Freq
# 1 apple 3
# 2 banana 2
# 3 cherry 2
# 4 date 1
Or
setNames(data.frame(table(df)), c("Fruit", "Freq"))
# Fruit Freq
# 1 apple 3
# 2 banana 2
# 3 cherry 2
# 4 date 1
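Since the question mentions frequency-based operations and subsetting, here is a minimal sketch of what could be done with the resulting data frame. It assumes the `df` from the question; the names `counts`, `common` and `sorted` are made up for illustration, and `table(df$fruit)` is used together with `setNames` so that the column names are set explicitly.

```r
df <- data.frame(fruit = c("apple", "banana", "cherry", "cherry",
                           "apple", "banana", "apple", "date"))

# One row per distinct value, with its count in the Freq column.
# stringsAsFactors = FALSE keeps Fruit as plain character strings.
counts <- setNames(as.data.frame(table(df$fruit), stringsAsFactors = FALSE),
                   c("Fruit", "Freq"))

# Keep only the values that occur at least twice
common <- counts[counts$Freq >= 2, ]

# Order the rows by descending frequency
sorted <- counts[order(-counts$Freq), ]
```

Because `Freq` is an ordinary integer column, the usual data frame tools (`subset`, `order`, `merge`) should work on it without further conversion.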
Answer 1: (score: 4)
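Using only base R, you can simply convert the table to a matrix. The result is a single column of counts, with the values stored as its row.names.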
as.matrix(table(df))
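To make the shape of that result concrete, here is a small sketch, assuming the `df` from the question; the name `m` is just for illustration. It uses `table(df$fruit)` rather than `table(df)` so that the input is an unambiguous one-dimensional table.

```r
df <- data.frame(fruit = c("apple", "banana", "cherry", "cherry",
                           "apple", "banana", "apple", "date"))

# A 4 x 1 integer matrix: counts in the single column, values as row names
m <- as.matrix(table(df$fruit))

rownames(m)                      # the distinct values
m["apple", 1]                    # look up the count for one value
m[m[, 1] >= 2, , drop = FALSE]   # keep rows with a count of at least 2
```

Note that a matrix holds a single type, so the values live in the row names rather than in a column of their own. If the values need to be a real column, a data frame is probably the more convenient choice.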
Answer 2: (score: 3)
Use the melt function from the reshape2 package:
DF <- table(df$fruit)
library(reshape2)
result <- melt(DF)
colnames(result) <- c('Fruit', 'Freq')
result
Fruit Freq
1 apple 3
2 banana 2
3 cherry 2
4 date 1
Answer 3: (score: 0)
If you want to stick with the SQL mindset, there is always the "sqldf" package!
df = data.frame(fruit=c("apple", "banana", "cherry", "cherry",
"apple", "banana", "apple", "date"))
library(sqldf)
sqldf("SELECT fruit, COUNT(*) AS Freq FROM df GROUP BY fruit")
# fruit Freq
# 1 apple 3
# 2 banana 2
# 3 cherry 2
# 4 date 1
There you go! You already knew the answer ;)
Answer 4: (score: 0)
Just to add a different answer:
df = data.frame(fruit=c("apple", "banana", "cherry", "cherry", "apple", "banana", "apple", "date"));
df
fruit
1 apple
2 banana
3 cherry
4 cherry
5 apple
6 banana
7 apple
8 date
freq <- unlist(lapply(unique(df$fruit),function(x) length(which(df$fruit ==x))))
freq
[1] 3 2 2 1
df.new <- data.frame(fruits = unique(df$fruit),freq)
df.new
fruits freq
1 apple 3
2 banana 2
3 cherry 2
4 date 1
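The `lapply` call above scans the whole column once for every distinct value. As a hedged variation on the same idea, the counts can probably be obtained in a single pass with `match` and `tabulate`; the name `vals` is introduced here for illustration only.

```r
df <- data.frame(fruit = c("apple", "banana", "cherry", "cherry",
                           "apple", "banana", "apple", "date"))

# Distinct values, in order of first appearance
vals <- unique(df$fruit)

# match() maps each entry to its position in vals;
# tabulate() then counts how often each position occurs
freq <- tabulate(match(df$fruit, vals), nbins = length(vals))

df.new <- data.frame(fruits = vals, freq)
```

Unlike `table()`, which sorts the values, this keeps them in the order in which they first appear in the column.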