我正在尝试使用多个功能聚合数据。它的效果很好,但输出不是它所说的。
df<-data.frame(PLTID=rep(paste(letters), 5), R=rnorm(130, 150, 40), G=rnorm(130, 100, 50), B=rnorm(130, 200, 25))
agg.data<-aggregate(data=df, . ~ PLTID, FUN = function(x) c(mean=mean(x, na.rm=TRUE), sd=sd(x, na.rm=TRUE), n=length(x)))
> head(agg.data)
PLTID R.mean R.sd R.n G.mean G.sd G.n B.mean B.sd B.n
1 a 144.29202 28.49934 5.00000 87.85852 76.28156 5.00000 192.230731 29.349837 5.000000
2 b 148.41993 31.37764 5.00000 84.14367 21.59658 5.00000 224.862334 15.769459 5.000000
3 c 158.89111 17.75320 5.00000 114.40942 53.81999 5.00000 205.931132 5.790658 5.000000
4 d 153.70118 34.68649 5.00000 137.97890 54.79668 5.00000 209.587936 21.877864 5.000000
5 e 154.28002 27.33376 5.00000 124.87306 71.09103 5.00000 225.774236 17.720090 5.000000
6 f 144.08148 37.30555 5.00000 57.15275 44.68034 5.00000 204.709050 18.207047 5.000000
这就是我想要的......除非我想稍后访问这些列,否则它们实际上并不存在。
> names(agg.data)
[1] "PLTID" "R" "G" "B"
> str(agg.data)
'data.frame': 26 obs. of 4 variables:
$ PLTID: Factor w/ 26 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
$ R : num [1:26, 1:3] 144 148 159 154 154 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr "mean" "sd" "n"
$ G : num [1:26, 1:3] 87.9 84.1 114.4 138 124.9 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr "mean" "sd" "n"
$ B : num [1:26, 1:3] 192 225 206 210 226 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr "mean" "sd" "n"
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.8.10 plyr_1.8 gridExtra_0.9.1 rgl_0.93.963 ggplot2_0.9.3.1
loaded via a namespace (and not attached):
[1] colorspace_1.2-4 dichromat_2.0-0 digest_0.6.3 gtable_0.1.2 labeling_0.2 MASS_7.3-29 munsell_0.4.2 proto_0.3-10 RColorBrewer_1.0-5
[10] reshape2_1.2.2 scales_0.2.3 stringr_0.6.2 tools_3.0.2
答案 0 :(得分:3)
通常,当我遇到这个问题时,我使用do.call(data.frame, ...)
来“压扁”输出。请参阅以下示例:
agg.data.2 <- do.call(data.frame, agg.data)
head(agg.data.2)
# PLTID R.mean R.sd R.n G.mean G.sd G.n B.mean B.sd B.n
# 1 a 154.5559 33.43817 5 137.17159 26.21461 5 196.7478 28.59095 5
# 2 b 156.8467 32.62164 5 131.35181 48.94969 5 206.3617 17.97154 5
# 3 c 134.1172 21.83323 5 50.35166 71.57083 5 176.2187 38.82059 5
# 4 d 143.5015 43.31296 5 104.60027 103.66002 5 210.0669 18.40418 5
# 5 e 141.8565 27.02871 5 76.82525 47.76355 5 194.3454 30.87080 5
# 6 f 143.3337 12.14870 5 105.95271 42.22400 5 189.3063 12.11930 5
str(agg.data.2)
# 'data.frame': 26 obs. of 10 variables:
# $ PLTID : Factor w/ 26 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
# $ R.mean: num 155 157 134 144 142 ...
# $ R.sd : num 33.4 32.6 21.8 43.3 27 ...
# $ R.n : num 5 5 5 5 5 5 5 5 5 5 ...
# $ G.mean: num 137.2 131.4 50.4 104.6 76.8 ...
# $ G.sd : num 26.2 48.9 71.6 103.7 47.8 ...
# $ G.n : num 5 5 5 5 5 5 5 5 5 5 ...
# $ B.mean: num 197 206 176 210 194 ...
# $ B.sd : num 28.6 18 38.8 18.4 30.9 ...
# $ B.n : num 5 5 5 5 5 5 5 5 5 5 ...