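I'm looking at the help page for the aggregate function in R. I've never used this convenience function, but I have a process that it should help me speed up. However, I can't follow the example well enough to understand what is going on.
Here is one of the examples: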
1> aggregate(state.x77, list(Region = state.region), mean)
         Region Population Income Illiteracy Life Exp Murder HS Grad  Frost   Area
1     Northeast       5495   4570      1.000    71.26  4.722   53.97 132.78  18141
2         South       4208   4012      1.738    69.71 10.581   44.34  64.62  54605
3 North Central       4803   4611      0.700    71.77  5.275   54.52 138.83  62652
4          West       2915   4703      1.023    71.23  7.215   62.00 102.15 134463
The output here is exactly what I would expect. To work out what was happening, I took a look at state.x77:
1> head(state.x77)
           Population Income Illiteracy Life Exp Murder HS Grad Frost   Area
Alabama          3615   3624        2.1    69.05   15.1    41.3    20  50708
Alaska            365   6315        1.5    69.31   11.3    66.7   152 566432
Arizona          2212   4530        1.8    70.55    7.8    58.1    15 113417
Arkansas         2110   3378        1.9    70.66   10.1    39.9    65  51945
California      21198   5114        1.1    71.71   10.3    62.6    20 156361
Colorado         2541   4884        0.7    72.06    6.8    63.9   166 103766
OK, this is odd to me. I expected to see a column named state.region in state.x77, so state.region must be an object in its own right. I ran str() on it:
1> str(state.region)
Factor w/ 4 levels "Northeast","South",..: 2 4 4 2 4 4 1 2 2 2 ...
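It looks like state.region is just a factor. Somehow there is a connection between state.region and state.x77 that lets aggregate() group state.x77 by state.region, but that connection is a mystery to me. Can you help me fill in my obvious misunderstanding?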
Answer 0 (score: 10)
From an old tampon (was it tampons?) commercial: "Proof, not just promises!"
state.x777 <- as.data.frame(state.x77)
state.x777 <- cbind(state.x777, stejt.ridzn = state.region)
aggregate(state.x77, list(Region = state.x777$stejt.ridzn), mean)
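The snippet above suggests that it makes no difference whether the grouping factor is a free-standing object or a column of the data. A minimal sketch of that comparison follows, assuming the built-in datasets are loaded; the names `states`, `region`, `by_object` and `by_column` are made up for this illustration and do not come from the help page.

```r
# Copy the matrix into a data frame and attach the region as an
# ordinary column. `states` and `region` are hypothetical names.
states <- data.frame(state.x77, region = state.region)

# Group by the free-standing factor, as in the help-page example.
by_object <- aggregate(state.x77, list(Region = state.region), mean)

# Group by the column instead, using the formula interface.
by_column <- aggregate(Income ~ region, data = states, FUN = mean)

# Compare the mean income per region from the two calls.
same_means <- isTRUE(all.equal(by_object$Income, by_column$Income))
print(same_means)
```

In both calls the factor is paired with the rows by position only: its first element labels the first row, its second element the second row, and so on.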
Answer 1 (score: 4)
They are probably in the right order because these objects are documented on the same help page, ?state.x77, which contains:
Details:
R currently contains the following “state” data sets. Note that
all data are arranged according to alphabetical order of the state
names.
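One way to check this alignment yourself is to compare the row names of state.x77 with state.name, another data set documented on the same help page. This is only a sketch; `same_order`, `same_length` and `lookup` are names chosen for this illustration.

```r
# state.name holds the state names; state.x77 uses them as row names.
same_order <- identical(rownames(state.x77), state.name)

# The factor has one entry per row of the matrix.
same_length <- length(state.region) == nrow(state.x77)

# Pair each state with its region to inspect the mapping directly.
lookup <- data.frame(state = state.name, region = state.region)

print(same_order)
print(same_length)
print(head(lookup))
```

If both checks print TRUE, element i of state.region describes row i of state.x77, which is the only link aggregate() relies on.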
Answer 2 (score: 1)
Try help(state.region) and so on; they are all aligned:
Details:
R currently contains the following “state” data sets. Note that all data are arranged according to alphabetical order of the state names.
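To see that the grouping depends on position rather than on any stored link between the objects, the sketch below computes the mean income per region with tapply() and then repeats the calculation with the factor reversed. The reversal is purely an illustration (not something you would do in practice), and `income`, `by_position` and `scrambled` are hypothetical names.

```r
# Element i of state.region labels row i of state.x77.
income <- state.x77[, "Income"]
by_position <- tapply(income, state.region, mean)

# Reversing the factor breaks the pairing between states and regions,
# so the group means are computed over the wrong states.
scrambled <- tapply(income, rev(state.region), mean)

print(by_position)
print(scrambled)
```

The first result should match the Income column of the aggregate() output in the question, while the second should not, since the labels no longer line up with the rows.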