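I have a data frame with many rows and columns, holding various effort calculations and measurements for 30 people performing 6 different activities.
I want to compute the mean of every variable for each person and each activity, and summarize the result in a table...
My solution was to write two nested loops and go from there, but isn't there another, faster way to do it? I recently discovered the packages dplyr, tidyr, plyr and reshape2, and I think one of them could solve this, but I can't find how...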
Can you help me?
subject id_activity activity tBodyAcc-mean()-X tBodyAcc-mean()-Y tBodyAcc-mean()-Z tGravityAcc-mean()-X tGravityAcc-mean()-Y tGravityAcc-mean()-Z tBodyAccJerk-mean()-X tBodyAccJerk-mean()-Y tBodyAccJerk-mean()-Z
1 1 1 WALKING 0.2885845 -0.020294171 -0.13290514 0.9633961 -0.1408397 0.11537494 0.07799634 0.005000803 -0.0678308080
2 1 1 WALKING 0.2784188 -0.016410568 -0.12352019 0.9665611 -0.1415513 0.10937881 0.07400671 0.005771104 0.0293766330
3 1 1 WALKING 0.2796531 -0.019467156 -0.11346169 0.9668781 -0.1420098 0.10188392 0.07363596 0.003104037 -0.0090456308
4 1 1 WALKING 0.2791739 -0.026200646 -0.12328257 0.9676152 -0.1439765 0.09985014 0.07732061 0.020057642 -0.0098647722
5 1 1 WALKING 0.2766288 -0.016569655 -0.11536185 0.9682244 -0.1487502 0.09448590 0.07344436 0.019121574 0.0167799790
6 1 1 WALKING 0.2771988 -0.010097850 -0.10513725 0.9679482 -0.1482100 0.09190972 0.07793244 0.018684046 0.0093444336
7 1 1 WALKING 0.2794539 -0.019640776 -0.11002215 0.9679295 -0.1442821 0.09314463 0.08217077 -0.017014670 -0.0157981660
8 1 1 WALKING 0.2774325 -0.030488303 -0.12536043 0.9684915 -0.1467054 0.09170816 0.07236423 0.008747856 -0.0044681354
9 1 1 WALKING 0.2772934 -0.021750698 -0.12075082 0.9684812 -0.1543740 0.08511826 0.07528437 0.030762704 0.0112119500
10 1 1 WALKING 0.2805857 -0.009960298 -0.10606516 0.9684180 -0.1563020 0.08087447 0.07636932 0.012518906 0.0030843751
11 1 1 WALKING 0.2768803 -0.012721805 -0.10343832 0.9692027 -0.1523614 0.08125808 0.07139686 0.016842441 0.0010303821
12 1 1 WALKING 0.2762282 -0.021441302 -0.10820234 0.9692533 -0.1500638 0.08293121 0.07608451 -0.002311558 -0.0076736296
13 1 1 WALKING 0.2784570 -0.020414761 -0.11273172 0.9689963 -0.1523621 0.08315080 0.07710200 0.017027167 -0.0009852394
14 1 1 WALKING 0.2771750 -0.014712802 -0.10675647 0.9690440 -0.1541413 0.08181960 0.07761238 0.019489223 0.0152076830
15 1 1 WALKING 0.2979457 0.027093908 -0.06166812 0.9448949 -0.2926233 -0.02143552 0.06665616 -0.068367084 -0.0336076010
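There are 10,299 rows and 56 columns. I have not shown all the columns, just a subset so you can see what the data looks like... Sorry for my English ^^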
Answer 0 (score: 1)
You can try the function aggregate. It is designed for exactly what you are looking for.
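# Toy data: two subjects (subj), two activities (act), three measured variables.
# The values come from rnorm(), so your numbers will differ from those shown.
xy <- data.frame(subj = c(1,1,1,1,2,2,2,2),
                 act = c("a", "a", "b", "b", "a", "a", "b", "b"),
                 stat1 = rnorm(8),
                 stat2 = rnorm(8),
                 stat3 = rnorm(8))
xy
# Mean of every remaining column (the dot) for each subj/act combination
aggregate(. ~ subj + act, data = xy, FUN = mean)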
subj act stat1 stat2 stat3
1 1 a 0.10244340 0.9175242 -0.1240974
2 2 a 0.06747905 -0.3221609 0.8647476
3 1 b -0.17143146 0.9971627 0.3603535
4 2 b -1.32023632 0.6584811 0.2126244
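The variable names in the question (for example tBodyAcc-mean()-X) are not syntactic R names, and I have not checked how the formula interface handles them in every R version. As a cautious sketch, the non-formula interface of aggregate selects columns by name and avoids the issue. The data frame har below is a hypothetical miniature of the asker's data: the column names come from the question, the values are random, and measure_cols is just an illustrative helper name.

```r
# Hypothetical miniature of the asker's data (random values, real column names)
har <- data.frame(
  subject     = rep(1:2, each = 4),
  id_activity = rep(c(1, 1, 2, 2), times = 2),
  activity    = rep(c("WALKING", "WALKING", "SITTING", "SITTING"), times = 2),
  `tBodyAcc-mean()-X` = rnorm(8),
  `tBodyAcc-mean()-Y` = rnorm(8),
  check.names = FALSE  # keep the non-syntactic names exactly as written
)

# Every column that is not an identifier is treated as a measurement
measure_cols <- setdiff(names(har), c("subject", "id_activity", "activity"))

# Mean of each measurement column per subject/activity combination
res <- aggregate(har[measure_cols],
                 by = list(subject = har$subject, activity = har$activity),
                 FUN = mean)
res
```

With the real data this should give one row per subject/activity pair, i.e. up to 30 x 6 = 180 rows.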
You can also use the data.table package, which can often perform such operations faster than base R solutions.
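library(data.table)
setDT(xy)  # convert xy to a data.table by reference
# .SD holds the columns not used in `by`; take the mean of each one per group
xy[, lapply(.SD, mean), by = .(subj, act)]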
subj act stat1 stat2 stat3
1: 1 a 0.10244340 0.9175242 -0.1240974
2: 1 b -0.17143146 0.9971627 0.3603535
3: 2 a 0.06747905 -0.3221609 0.8647476
4: 2 b -1.32023632 0.6584811 0.2126244
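Since the question mentions dplyr, the same grouped mean could be sketched there as well. This assumes dplyr 1.0.0 or later (for across()) and rebuilds the toy xy data frame so the snippet stands alone. The set.seed() call is only there to make the sketch reproducible, so its numbers will not match the output shown above.

```r
library(dplyr)

set.seed(1)  # only for reproducibility of this sketch
xy <- data.frame(subj = c(1, 1, 1, 1, 2, 2, 2, 2),
                 act = c("a", "a", "b", "b", "a", "a", "b", "b"),
                 stat1 = rnorm(8),
                 stat2 = rnorm(8),
                 stat3 = rnorm(8))

# Group by subject and activity, then average every non-grouping column
res <- xy %>%
  group_by(subj, act) %>%
  summarise(across(everything(), mean), .groups = "drop")
res
```

On the real data, the grouping columns would be subject and activity instead of subj and act.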