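R: aggregate a data frame by two variables and apply a function

Time: 2015-04-26 14:28:44

Tags: r aggregate

I have a data frame that I want to aggregate by two variables, applying the function mean to every measurement. Here is the head of the data frame: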

  Subject Activity         meassureA         meassureB         meassureC       meassureD
1       1  running         0.2820216      -0.037696218       -0.13489730       -0.3282802
2       1  running         0.2558408      -0.064550029       -0.09518634       -0.2292069
3       1  walking         0.2548672       0.003814723       -0.12365809       -0.2751579
4       2  running         0.3433705      -0.014446221       -0.16737697       -0.2299235

Now I would like to get a result like this:

  Subject Activity         meassureA         meassureB         meassureC       meassureD
1       1  running         mean(S1,A1)      mean(S1,A1)       mean(S1,A1)       mean(S1,A1)
2       1  walking         mean(S1,A2)      mean(S1,A2)       mean(S1,A2)       mean(S1,A2)
3       2  running         mean(S2,A1)      mean(S2,A1)       mean(S2,A1)       mean(S2,A1)
4       2  walking         mean(S2,A2)      mean(S2,A2)       mean(S2,A2)       mean(S2,A2)

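where the value of meassureA is the mean of all values recorded for subject 1 (S1) performing activity 1 (A1).

I was thinking of using aggregate(), but so far I have not been able to apply what I have learned to this problem. Any help is much appreciated.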

1 Answer:

Answer 0: (score: 1)

As David mentioned in the comments, you can do this:

aggregate(. ~ Subject + Activity, df, mean)

Or using data.table:

data.table::setDT(df)[, lapply(.SD, mean), by = .(Subject, Activity)]

Or using dplyr:

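library(dplyr)
# summarise_each(funs(mean)) is deprecated in current dplyr; across() is the replacement
df %>% group_by(Subject, Activity) %>% summarise(across(everything(), mean))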

which gives:

#  Subject Activity meassureA    meassureB  meassureC  meassureD
#1       1  running 0.2689312 -0.051123123 -0.1150418 -0.2787436
#2       1  walking 0.2548672  0.003814723 -0.1236581 -0.2751579
#3       2  running 0.3433705 -0.014446221 -0.1673770 -0.2299235
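
For anyone who wants to reproduce this, here is a minimal base R sketch. It assumes df holds only the four rows shown in the question (the real data frame is presumably larger) and keeps the column names as they are spelled there. The final reordering step is my own addition, only there so the rows line up with the output above.

```r
# Rebuild the four rows from the question; column names kept as spelled there.
df <- data.frame(
  Subject   = c(1, 1, 1, 2),
  Activity  = c("running", "running", "walking", "running"),
  meassureA = c(0.2820216, 0.2558408, 0.2548672, 0.3433705),
  meassureB = c(-0.037696218, -0.064550029, 0.003814723, -0.014446221),
  meassureC = c(-0.13489730, -0.09518634, -0.12365809, -0.16737697),
  meassureD = c(-0.3282802, -0.2292069, -0.2751579, -0.2299235)
)

# "." on the left-hand side stands for every column not named on the right,
# so each meassure* column is averaged within each Subject/Activity pair.
res <- aggregate(. ~ Subject + Activity, data = df, FUN = mean)

# Sort by Subject, then Activity, so the row order matches the output above.
res <- res[order(res$Subject, res$Activity), ]
print(res)
```

Only three rows come back because these sample rows contain no walking observations for subject 2. Also note that the formula interface of aggregate() drops rows containing NA in any column by default (na.action = na.omit), which may matter on real data.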