GNU R ANOVA with aov(): multiple variables as factors vs. one variable with multiple factor levels

Asked: 2014-10-10 21:34:52

Tags: r statistics anova

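I am currently evaluating a problem with a repeated-measures design and want to run an analysis of variance, but I am having trouble fitting the model. Why does it make a difference whether I model the factors as multiple variables (each variable representing one factor), or as a single variable (each factor represented by one value of that variable)?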

To illustrate my problem, I chose the following example (adapted from this tutorial):

### Data with repeated measures
groceries <- data.frame(
  c("lettuce","potatoes","milk","eggs","bread","cereal",
    "ground.beef","tomato.soup","laundry.detergent","aspirin"),
  c(1.17,1.77,1.49,0.65,1.58,3.13,2.09,0.62,5.89,4.46),
  c(1.78,1.98,1.69,0.99,1.70,3.15,1.88,0.65,5.99,4.84),
  c(1.29,1.99,1.79,0.69,1.89,2.99,2.09,0.65,5.99,4.99),
  c(1.29,1.99,1.59,1.09,1.89,3.09,2.49,0.69,6.99,5.15))
colnames(groceries) <- c("subject","storeA","storeB","storeC","storeD")

### Rearranging data for ANOVA (r); 'store' is a single factor with 4 levels
groceries <- cbind(stack(groceries[,2:5]), rep(groceries$subject,4))
colnames(groceries) <- c("price", "store", "subject")

### Adding one column per level of 'store' -> thus, 4 boolean variables
factorLookup <- as.character(unique(groceries$store))
groceries <- cbind(groceries, 
                    sapply(1 : length(factorLookup), 
                           function(x) groceries$store ==  factorLookup[x]))
names(groceries)[4:7] <- factorLookup

### ANOVA using variable 'store' as factor
aov.1var <- aov(price ~ store + Error(subject/store), data=groceries)
summary(aov.1var)

### ANOVA using 4 factor variables
aov.4fact <- aov(price ~ storeA * storeB * storeC * storeD 
               + Error(subject/(storeA * storeB * storeC * storeD)), data=groceries)
summary(aov.4fact)

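These are the results; the two approaches clearly do not give the same output. In addition, the multi-variable approach raises a warning: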

> ### ANOVA using variable 'store' as factor
> aov.1var <- aov(price ~ store + Error(subject/store), data=groceries)
> summary(aov.1var)

Error: subject
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  9  115.2    12.8               

Error: subject:store
          Df Sum Sq Mean Sq F value Pr(>F)  
store      3 0.5859 0.19529   4.344 0.0127 *
Residuals 27 1.2137 0.04495                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> ### ANOVA using 4 factor variables
> aov.4fact <- aov(price ~ storeA * storeB * storeC * storeD 
+                + Error(subject/(storeA * storeB * storeC * storeD)), data=groceries)
Warning message:
In aov(price ~ storeA * storeB * storeC * storeD + Error(subject/(storeA *  :
  Error() model is singular
> summary(aov.4fact)

Error: subject
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  9  115.2    12.8               

Error: subject:storeA
          Df Sum Sq Mean Sq F value Pr(>F)   
storeA     1 0.3763  0.3763   16.02 0.0031 **
Residuals  9 0.2115  0.0235                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subject:storeB
          Df Sum Sq Mean Sq F value Pr(>F)
storeB     1 0.0290 0.02904    0.54  0.481
Residuals  9 0.4842 0.05380               

Error: subject:storeC
          Df Sum Sq Mean Sq F value Pr(>F)
storeC     1 0.1805 0.18050   3.135   0.11
Residuals  9 0.5181 0.05757  
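
One likely reason for the "singular" warning is that the four boolean columns are linearly dependent: in every row exactly one of them is TRUE, so together they add up to the intercept column. A minimal sketch of that check follows. It rebuilds only the prices from the data above; the names `prices`, `long`, `ind` and `X` are illustrative and do not appear in the original code.

```r
# Prices from the example above, one column per store
prices <- data.frame(
  storeA = c(1.17, 1.77, 1.49, 0.65, 1.58, 3.13, 2.09, 0.62, 5.89, 4.46),
  storeB = c(1.78, 1.98, 1.69, 0.99, 1.70, 3.15, 1.88, 0.65, 5.99, 4.84),
  storeC = c(1.29, 1.99, 1.79, 0.69, 1.89, 2.99, 2.09, 0.65, 5.99, 4.99),
  storeD = c(1.29, 1.99, 1.59, 1.09, 1.89, 3.09, 2.49, 0.69, 6.99, 5.15))

# Long format: 'values' holds the price, 'ind' is the store factor
long <- stack(prices)

# One boolean indicator column per store level (same idea as factorLookup above)
ind <- sapply(levels(long$ind), function(s) long$ind == s)

# Every row has exactly one TRUE, so the indicators sum to a column of ones
all(rowSums(ind) == 1)

# Design matrix with an intercept plus all four indicators:
# 5 columns, but the rank is lower, so one column is redundant
X <- cbind(intercept = 1, ind * 1)
c(columns = ncol(X), rank = qr(X)$rank)
```

With an intercept in the model, only three of the four indicators carry independent information, which is consistent with the output above reporting strata for storeA, storeB and storeC but none for storeD.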

What confuses me: with a linear mixed model (e.g. lme {nlme}), both approaches give the same result. Why does this not hold for ANOVA?
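
For reference, one way such a mixed-model comparison could be set up is sketched below. The random intercept per subject and the choice to drop storeA as the reference level are assumptions made for this sketch, and `wide`, `long`, `fit_factor` and `fit_dummy` are illustrative names. It requires the nlme package.

```r
library(nlme)

# Data from the example above
wide <- data.frame(
  subject = c("lettuce", "potatoes", "milk", "eggs", "bread", "cereal",
              "ground.beef", "tomato.soup", "laundry.detergent", "aspirin"),
  storeA = c(1.17, 1.77, 1.49, 0.65, 1.58, 3.13, 2.09, 0.62, 5.89, 4.46),
  storeB = c(1.78, 1.98, 1.69, 0.99, 1.70, 3.15, 1.88, 0.65, 5.99, 4.84),
  storeC = c(1.29, 1.99, 1.79, 0.69, 1.89, 2.99, 2.09, 0.65, 5.99, 4.99),
  storeD = c(1.29, 1.99, 1.59, 1.09, 1.89, 3.09, 2.49, 0.69, 6.99, 5.15))

# Long format with columns price, store, subject
long <- cbind(stack(wide[, 2:5]), subject = rep(wide$subject, 4))
names(long)[1:2] <- c("price", "store")
long$subject <- factor(long$subject)

# 0/1 indicator column for each store level
for (s in levels(long$store)) long[[s]] <- as.numeric(long$store == s)

# Model 1: 'store' as a single factor with four levels
fit_factor <- lme(price ~ store, random = ~ 1 | subject, data = long)

# Model 2: three indicator variables; storeA is left out as the reference
# (assumption: including all four plus an intercept would be rank deficient)
fit_dummy <- lme(price ~ storeB + storeC + storeD,
                 random = ~ 1 | subject, data = long)

anova(fit_factor)  # one joint test for the 3-df store effect
all.equal(as.numeric(logLik(fit_factor)), as.numeric(logLik(fit_dummy)))
```

In this setup both formulas produce the same fixed-effects design matrix (an intercept plus indicators for storeB, storeC and storeD), so the two fits describe the same model.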

Maybe my general understanding of ANOVA is wrong. Why does the single-variable approach not output an F value for each factor?!
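
A possible reading of the two outputs, sketched below under stated assumptions: the single-variable model gives one joint test of the store effect on 3 degrees of freedom, while the four-variable model appears to split that same effect into three sequential single-df contrasts (storeA vs. the mean of the rest, storeB vs. the mean of storeC and storeD, storeC vs. storeD), each tested against its own subject-by-contrast error term. The helper `contrast_F` and the other names are hypothetical and only serve this illustration.

```r
# Prices from the example above, one row per grocery item (subject)
wide <- data.frame(
  storeA = c(1.17, 1.77, 1.49, 0.65, 1.58, 3.13, 2.09, 0.62, 5.89, 4.46),
  storeB = c(1.78, 1.98, 1.69, 0.99, 1.70, 3.15, 1.88, 0.65, 5.99, 4.84),
  storeC = c(1.29, 1.99, 1.79, 0.69, 1.89, 2.99, 2.09, 0.65, 5.99, 4.99),
  storeD = c(1.29, 1.99, 1.59, 1.09, 1.89, 3.09, 2.49, 0.69, 6.99, 5.15))
n <- nrow(wide)

# Per-subject contrast scores (sequential, Helmert-like)
cA <- wide$storeA - rowMeans(wide[, c("storeB", "storeC", "storeD")])
cB <- wide$storeB - rowMeans(wide[, c("storeC", "storeD")])
cC <- wide$storeC - wide$storeD

# Hypothetical helper: F(1, n - 1) as the squared one-sample t statistic
contrast_F <- function(d) unname(t.test(d)$statistic)^2

# These should match the F values of the storeA, storeB and storeC strata
F_vals <- c(storeA = contrast_F(cA), storeB = contrast_F(cB),
            storeC = contrast_F(cC))
round(F_vals, 3)

# Sum of squares of each contrast: n * mean^2 / (sum of squared weights)
ss <- c(storeA = n * mean(cA)^2 / (1 + 1/3),
        storeB = n * mean(cB)^2 / (1 + 1/2),
        storeC = n * mean(cC)^2 / 2)

# The three pieces add up to the 'store' Sum Sq of the one-factor model
sum(ss)
```

If this reading is right, the two analyses are not contradictory: the three contrast sums of squares add up to the store Sum Sq of 0.5859 reported by the one-factor model, and the three residual terms add up to its residual Sum Sq.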

Any help is welcome!

0 answers:

No answers yet