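I need to reshape and summarize a data frame. I have settled on reshape2 (and perhaps dplyr) as the most likely tools for the job, but the only approach I have come up with is so inefficient and tedious that it is not worth showing here. Here is an example dataset; the real one has more variables to summarize and more aggregate functions: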
> d <- data.frame(type=sample(c("dog","goat", "pika"), 50, replace=TRUE), a=rnorm(50, 50,8), b=rnorm(50,70,4))
> head(d)
  type        a        b
1  dog 49.29015 73.09723
2  dog 35.16051 72.44976
3  dog 58.37524 66.41876
4 goat 66.05670 64.05190
5 goat 51.45586 69.84018
6 goat 63.10084 69.70595
I am trying to get it into this shape:
  type variable mean sd
1  dog        a   50  8
2  dog        b   70  4
3 goat        a   50  8
4 goat        b   70  4
5 pika        a   50  8
6 pika        b   70  4
Answer 0 (score: 3)
You can use dplyr together with tidyr:
library(dplyr)
library(tidyr)
d %>%
  gather(variable, val, a:b) %>%
  group_by(type, variable) %>%
  summarise(Mean=mean(val, na.rm=TRUE), Sd=sd(val, na.rm=TRUE))
which gives the result below (the numbers differ from yours because the example did not use set.seed):
# type variable Mean Sd
#1 dog a 45.72271 7.304119
#2 dog b 72.16658 5.562985
#3 goat a 48.10097 6.856664
#4 goat b 70.16296 4.014350
#5 pika a 52.88040 6.434812
#6 pika b 68.70830 4.343295
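For readers on newer package versions, the same reshape-then-summarise idea can be sketched with pivot_longer, which tidyr (1.0.0 and later) offers as the successor to gather. This is only an illustration, not part of the original answer: the seed value 42 and the object name res are assumptions added here so the run is reproducible, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Assumed seed (not in the original question), only to make the run repeatable
set.seed(42)
d <- data.frame(type = sample(c("dog", "goat", "pika"), 50, replace = TRUE),
                a = rnorm(50, 50, 8),
                b = rnorm(50, 70, 4))

res <- d %>%
  # Stack columns a and b into one name column and one value column
  pivot_longer(cols = a:b, names_to = "variable", values_to = "val") %>%
  # One group per (type, variable) pair
  group_by(type, variable) %>%
  # One summary column per aggregate function; drop grouping afterwards
  summarise(Mean = mean(val, na.rm = TRUE),
            Sd = sd(val, na.rm = TRUE),
            .groups = "drop")

print(res)
```

Further aggregate functions can be added as extra arguments to summarise, and more measure columns can be included by widening the cols selection.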
Answer 1 (score: 3)
This uses reshape2 and dplyr:
library(reshape2)
library(dplyr)
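# melt() stacks the numeric columns a and b into variable/value pairs;
# id.vars is given explicitly here, otherwise melt() guesses it and prints a message
summarize(group_by(melt(d, id.vars="type"), type, variable), mean=mean(value), sd=sd(value))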
# Source: local data frame [6 x 4]
# Groups: type
# type variable mean sd
# 1 dog a 47.42249 10.669676
# 2 dog b 68.92475 3.659657
# 3 goat a 52.41433 7.181254
# 4 goat b 70.28015 3.815483
# 5 pika a 51.78442 8.513349
# 6 pika b 71.10006 4.445932