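I'm a beginner with R. It seems there is a quick and dirty way to calculate the mean of each set of n rows in a column, but is there something similar for the standard deviation (or standard error)? I'd like to avoid loops if possible, since this is only a small part of an increasingly unwieldy (for a beginner) piece of code I'm building. Here is a simplified example of the data set I'll be working with: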
Canopy Species Date Pa
1 Maple BETH 4/26/2014 -0.1162607263
2 Maple BETH 4/26/2014 -0.2742194706
3 Maple BETH 4/26/2014 -0.1864006372
4 Maple BETH 4/26/2014 -0.0739905518
5 Maple BETH 4/26/2014 -0.0751169983
6 Maple BETH 4/26/2014 -0.0782771938
7 Maple BETH 4/26/2014 -0.1671646757
8 Maple BETH 4/26/2014 -0.2464696338
9 Maple BETH 4/26/2014 -0.2176720386
10 Maple BETH 4/26/2014 -0.2283216397
11 Maple BETH 4/26/2014 -0.1152989165
12 Maple BETH 4/26/2014 -0.2720884764
13 Maple BETH 4/26/2014 -0.1849383730
14 Maple BETH 4/26/2014 -0.0734205199
15 Maple BETH 4/26/2014 -0.0745294634
16 Maple BETH 4/26/2014 -0.0776640601
17 Maple BETH 4/26/2014 -0.1658603785
18 Maple BETH 4/26/2014 -0.2445047320
19 Maple BETH 4/26/2014 -0.2159337593
20 Maple BETH 4/26/2014 -0.2264833266
Here is an example of the kind of code I mean. This finds the mean of every 10 rows in the Pa column:
mu<-colMeans(matrix(Table$Pa, nrow=10))
Thanks in advance for your help, and let me know if I can provide any other information.
Answer 0 (score: 1)
You can also use by, with an index that assigns each row to its block of 10:
> n<-nrow(Table)
> index<-ceiling((1:n)/10)
> by(Table$Pa,index,mean)
index: 1
[1] -0.1663894
------------------------------------------------------------
index: 2
[1] -0.1650722
> by(Table$Pa,index,sd)
index: 1
[1] 0.07604938
------------------------------------------------------------
index: 2
[1] 0.07544763
Edit: you can put these in a table, for example:
> cbind(index=unique(index),mean=by(Table$Pa,index,mean),sd=by(Table$Pa,index,sd))
index mean sd
1 1 -0.1663894 0.07604938
2 2 -0.1650722 0.07544763
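Since the question starts from colMeans(matrix(...)), here is a minimal sketch of the same matrix trick extended to the standard deviation and the standard error. The Pa values are copied from the question's table; the names block_size, blocks and result are my own, and the sketch assumes the number of rows is an exact multiple of the block size (otherwise matrix() recycles values and the last column is wrong).

```r
# Pa values copied from the question's sample table (20 rows)
Pa <- c(-0.1162607263, -0.2742194706, -0.1864006372, -0.0739905518,
        -0.0751169983, -0.0782771938, -0.1671646757, -0.2464696338,
        -0.2176720386, -0.2283216397, -0.1152989165, -0.2720884764,
        -0.1849383730, -0.0734205199, -0.0745294634, -0.0776640601,
        -0.1658603785, -0.2445047320, -0.2159337593, -0.2264833266)

block_size <- 10  # assumed block length (hypothetical name)

# Guard: the matrix approach only makes sense for complete blocks
stopifnot(length(Pa) %% block_size == 0)

# One column per block of 10 consecutive rows
blocks <- matrix(Pa, nrow = block_size)

block_mean <- colMeans(blocks)
block_sd   <- apply(blocks, 2, sd)           # sample SD of each column
block_se   <- block_sd / sqrt(block_size)    # standard error of the mean

result <- data.frame(block = seq_along(block_mean),
                     mean  = block_mean,
                     sd    = block_sd,
                     se    = block_se)
print(result)
```

apply(blocks, 2, sd) is still a loop internally, but it keeps the code to one line per statistic, which seems to be what the question is after.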
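Answer 1 (score: 0)
Here is a mixed base R / dplyr solution. First I create a column called fac_to_spli, which is the grouping factor for the standard deviation, then I use dplyr's group_by and mutate to do the calculation.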
library(dplyr)
# Label each block of 10 rows by its starting row number (1, 11, 21, ...)
df$fac_to_spli <- rep(seq(from = 1, to = nrow(df), by = 10), each = 10, length.out = nrow(df))
df %>% group_by(fac_to_spli) %>% mutate(stand_dev = sd(Pa))
Source: local data frame [20 x 6]
Groups: fac_to_spli [2]
Canopy Species Date Pa fac_to_spli stand_dev
(fctr) (fctr) (fctr) (dbl) (dbl) (dbl)
1 Maple BETH 4/26/2014 -0.11626073 1 0.07604938
2 Maple BETH 4/26/2014 -0.27421947 1 0.07604938
3 Maple BETH 4/26/2014 -0.18640064 1 0.07604938
4 Maple BETH 4/26/2014 -0.07399055 1 0.07604938
5 Maple BETH 4/26/2014 -0.07511700 1 0.07604938
6 Maple BETH 4/26/2014 -0.07827719 1 0.07604938
7 Maple BETH 4/26/2014 -0.16716468 1 0.07604938
8 Maple BETH 4/26/2014 -0.24646963 1 0.07604938
9 Maple BETH 4/26/2014 -0.21767204 1 0.07604938
10 Maple BETH 4/26/2014 -0.22832164 1 0.07604938
11 Maple BETH 4/26/2014 -0.11529892 11 0.07544763
12 Maple BETH 4/26/2014 -0.27208848 11 0.07544763
13 Maple BETH 4/26/2014 -0.18493837 11 0.07544763
14 Maple BETH 4/26/2014 -0.07342052 11 0.07544763
15 Maple BETH 4/26/2014 -0.07452946 11 0.07544763
16 Maple BETH 4/26/2014 -0.07766406 11 0.07544763
17 Maple BETH 4/26/2014 -0.16586038 11 0.07544763
18 Maple BETH 4/26/2014 -0.24450473 11 0.07544763
19 Maple BETH 4/26/2014 -0.21593376 11 0.07544763
20 Maple BETH 4/26/2014 -0.22648333 11 0.07544763
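For comparison, the same two steps can be sketched in base R alone: ave() plays the role of group_by plus mutate (the SD is repeated on every row), and aggregate() collapses to one row per block. The column name block and the object per_block are hypothetical names of mine; only the Pa column from the question is rebuilt here.

```r
# Pa values copied from the question's sample table (20 rows)
df <- data.frame(Pa = c(
  -0.1162607263, -0.2742194706, -0.1864006372, -0.0739905518,
  -0.0751169983, -0.0782771938, -0.1671646757, -0.2464696338,
  -0.2176720386, -0.2283216397, -0.1152989165, -0.2720884764,
  -0.1849383730, -0.0734205199, -0.0745294634, -0.0776640601,
  -0.1658603785, -0.2445047320, -0.2159337593, -0.2264833266))

block_size <- 10  # assumed block length (hypothetical name)

# Block number for each row: rows 1-10 -> 1, rows 11-20 -> 2, ...
# Integer division also copes with a shorter final block.
df$block <- (seq_len(nrow(df)) - 1) %/% block_size + 1

# Like group_by() + mutate(): each row carries its block's SD
df$stand_dev <- ave(df$Pa, df$block, FUN = sd)

# Like group_by() + summarise(): one row per block
per_block <- aggregate(Pa ~ block, data = df, FUN = sd)
names(per_block)[2] <- "sd"

print(per_block)
```

Which form is more convenient depends on what comes next: the per-row column is handy for plotting or standardising within a block, the collapsed table for reporting.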
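Answer 2 (score: 0)
What @rawr said, using the dplyr package. Note that round(row_number()/10) does not produce exact blocks of 10 rows: the first group holds only the first few rows and later groups straddle the intended boundaries, which is why the ids below start at 0. The output shown is also not from the question's table (the Pa values there are all small and negative).
df %>%
  # round() is kept from the original answer; ceiling(row_number() / 10) gives exact blocks of 10
  mutate(id = round(row_number() / 10)) %>%
  group_by(id) %>%
  summarize(mean = mean(Pa), sd = sd(Pa))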
id mean sd
(dbl) (dbl) (dbl)
1 0 52.00000 67.97058
2 1 32.22222 18.55921
3 2 44.54545 36.70521
4 3 23.33333 25.49510
5 4 24.54545 18.63525
6 5 58.88889 78.96905
7 6 52.72727 89.89893
8 7 31.11111 26.19372
9 8 24.54545 18.09068
10 9 50.00000 64.42049
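To see why the choice of index matters, here is a small base R sketch that counts how many rows land in each group under three indexing rules. The row count n = 20 matches the question's sample; the object names are mine. As I understand it, round() rounds an exact .5 to the nearest even number, so row 5 (0.5) goes to group 0 while row 15 (1.5) goes to group 2.

```r
n <- 20              # number of rows in the question's sample table
rows <- seq_len(n)

# Index used in the answer above: groups are not blocks of 10
by_round <- table(round(rows / 10))

# Two indices that do give consecutive blocks of exactly 10 rows
by_ceiling <- table(ceiling(rows / 10))
by_intdiv  <- table((rows - 1) %/% 10 + 1)

print(by_round)     # unequal group sizes
print(by_ceiling)   # every group has 10 rows
print(by_intdiv)    # same grouping as ceiling()
```

Swapping round for ceiling in the mutate() call should therefore be enough to get one summary row per block of 10.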