Calculating ANOVA by hand

Time: 2013-12-11 22:03:29

Tags: r anova

I am trying to calculate SSB and SSW "by hand" with the following code:

> data.soy      
   protein amount
1   cereal      5
2   cereal     17
3   cereal     12
4   cereal     10
5   cereal      4
6   energy     19
7   energy     10
8   energy      9
9   energy      7
10  energy      5
11  veggie     25
12 veggie      15
13  veggie     12
14  veggie      9
15  veggie      8

> soy.fit<-lm(amount~protein, data=data.soy)
> anova(soy.fit)

Analysis of Variance Table

Response: amount
          Df Sum Sq Mean Sq F value Pr(>F)
protein    3  55.53  18.511  0.4916 0.6953
Residuals 11 414.20  37.655    
> n1<-length(cereal)
> n1
[1] 5
> n2<-length(energy)
> n2
[1] 5
> n3<-length(veggy)
> n3
[1] 5
> m<-mean((cereal+energy)+(cereal+veggy))
> m
[1] 43
> s<-sd((cereal+energy)+(cereal+veggy))
> s
[1] 15.11622
> m1<-mean(cereal)
> m1
[1] 9.6
> m2<-mean(energy)
> m2
[1] 10
> m3<-mean(veggy)
> m3
[1] 13.8
> overallm<-(((n1*m1)+(n2*m2)+(n3*m3))/(n1+n2+n3))
> overallm
[1] 11.13333
> SSB<-((n1*(m1-overallm)^2)+(n2*(m2-overallm)^2)+(n3*(m3-overallm)^2))
> SSB    
[1] 53.73333

But when I check my answer with the anova() function, it reports 55.53. Am I doing something wrong, or is this a rounding issue or something else?

1 Answer:

Answer 0: (score: 3)

Interestingly, anova() is not the simplest or most intuitive function for ANOVA in R. You probably want aov() or, more likely, oneway.test().

Using oneway.test() (which by default does not assume equal variances):

> oneway.test(amount ~ protein, data=data.soy)

        One-way analysis of means (not assuming equal variances)

data:  amount and protein
F = 0.6117, num df = 2.000, denom df = 7.913, p-value = 0.5662
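
To make "not assuming equal variances" concrete, here is a minimal sketch of Welch's one-way statistic computed by hand. It assumes the data frame is re-typed from the question's printout with exactly three clean group labels; the names n.i, m.i, v.i, w.i, m.w, tmp and the welch.* variables are illustrative and do not come from the original post. The values should agree with the oneway.test() output above.

```r
# Data re-typed from the question's printout.
# ASSUMPTION: exactly three protein groups with clean labels.
data.soy <- data.frame(
  protein = factor(rep(c("cereal", "energy", "veggie"), each = 5)),
  amount  = c(5, 17, 12, 10, 4,
              19, 10, 9, 7, 5,
              25, 15, 12, 9, 8)
)

n.i <- tapply(data.soy$amount, data.soy$protein, length)  # group sizes
m.i <- tapply(data.soy$amount, data.soy$protein, mean)    # group means
v.i <- tapply(data.soy$amount, data.soy$protein, var)     # group variances
k   <- length(n.i)                                        # number of groups

w.i <- n.i / v.i                   # each group weighted by its precision
m.w <- sum(w.i * m.i) / sum(w.i)   # precision-weighted grand mean
tmp <- sum((1 - w.i / sum(w.i))^2 / (n.i - 1)) / (k^2 - 1)

# Welch's F statistic with its approximate denominator degrees of freedom
welch.F   <- sum(w.i * (m.i - m.w)^2) / ((k - 1) * (1 + 2 * (k - 2) * tmp))
welch.df2 <- 1 / (3 * tmp)
welch.p   <- pf(welch.F, k - 1, welch.df2, lower.tail = FALSE)

print(round(c(F = welch.F, df2 = welch.df2, p = welch.p), 4))
```

Each group is weighted by n/variance, so a noisier group contributes less to the between-group comparison, and the denominator degrees of freedom become fractional.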

Using aov():

> aov(amount ~ protein, data=data.soy)
Call:
   aov(formula = amount ~ protein, data = data.soy)

Terms:
                 protein Residuals
Sum of Squares   53.7333  416.0000
Deg. of Freedom        2        12

Residual standard error: 5.887841
Estimated effects may be unbalanced

> summary(aov(amount ~ protein, data=data.soy))
            Df Sum Sq Mean Sq F value Pr(>F)
protein      2   53.7   26.87   0.775  0.482
Residuals   12  416.0   34.67
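
The sums of squares that aov() reports can also be reproduced by hand, which is what the question was attempting. The following is a minimal sketch, assuming the data frame is re-typed from the question with exactly three clean group labels; the names grand.m, df.b, df.w, F.stat and p.val are illustrative and do not come from the original post.

```r
# Data re-typed from the question's printout.
# ASSUMPTION: exactly three protein groups with clean labels.
data.soy <- data.frame(
  protein = factor(rep(c("cereal", "energy", "veggie"), each = 5)),
  amount  = c(5, 17, 12, 10, 4,
              19, 10, 9, 7, 5,
              25, 15, 12, 9, 8)
)

n.i     <- tapply(data.soy$amount, data.soy$protein, length)  # group sizes
m.i     <- tapply(data.soy$amount, data.soy$protein, mean)    # group means
grand.m <- mean(data.soy$amount)                              # overall mean

# Between-group sum of squares: squared distance of each group mean
# from the overall mean, weighted by group size.
SSB <- sum(n.i * (m.i - grand.m)^2)

# Within-group sum of squares: ave() returns, for every observation,
# the mean of the group that observation belongs to.
SSW <- sum((data.soy$amount - ave(data.soy$amount, data.soy$protein))^2)

df.b   <- length(n.i) - 1               # between-group degrees of freedom
df.w   <- nrow(data.soy) - length(n.i)  # within-group degrees of freedom
F.stat <- (SSB / df.b) / (SSW / df.w)
p.val  <- pf(F.stat, df.b, df.w, lower.tail = FALSE)

print(c(SSB = SSB, SSW = SSW, F = F.stat, p = p.val))
```

With these data, SSB and SSW should match the Sum of Squares row printed by aov() above (53.7333 and 416).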

The aov() result above is the same as what oneway.test() gives with var.equal=TRUE:

> oneway.test(amount ~ protein, data=data.soy, var.equal=TRUE)

        One-way analysis of means

data:  amount and protein
F = 0.775, num df = 2, denom df = 12, p-value = 0.4824

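Or with lm() and anova(), as you did. Here protein has Df = 2 and Sum Sq = 53.73, which matches your hand-computed SSB; the table in your question shows Df = 3, which suggests that the protein factor in your data has four levels rather than three: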

> anova(lm(amount ~ protein, data=data.soy))
Analysis of Variance Table

Response: amount
          Df Sum Sq Mean Sq F value Pr(>F)
protein    2  53.73  26.867   0.775 0.4824
Residuals 12 416.00  34.667
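
The Df = 3 in the question's table can be explored with a short check of the factor levels. The sketch below rests on an assumption the thread does not confirm: that the label in row 12 carries a trailing space ("veggie "), which would be consistent with that row being printed one column to the left in the question. The names protein.bad, bad, tab.bad and tab.ok are illustrative.

```r
protein <- rep(c("cereal", "energy", "veggie"), each = 5)
amount  <- c(5, 17, 12, 10, 4,
             19, 10, 9, 7, 5,
             25, 15, 12, 9, 8)

# ASSUMPTION: row 12 has a stray trailing space in its label.
protein.bad     <- protein
protein.bad[12] <- "veggie "
bad <- data.frame(protein = factor(protein.bad), amount = amount)

# A stray label shows up as an extra factor level.
print(nlevels(bad$protein))
print(table(bad$protein))

# With four levels, protein gets 3 degrees of freedom.
tab.bad <- anova(lm(amount ~ protein, data = bad))
print(tab.bad)

# Strip surrounding whitespace and rebuild the factor, then refit.
bad$protein <- factor(trimws(as.character(bad$protein)))
tab.ok <- anova(lm(amount ~ protein, data = bad))
print(tab.ok)
```

Under this assumption the first fit gives Df = 3 with a Sum Sq of about 55.53 for protein, and the cleaned fit gives Df = 2 with 53.73, so the hand calculation in the question would be correct and the discrepancy would come from the data rather than from rounding.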