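I am currently evaluating a problem with a repeated-measures design and want to run an analysis of variance, but I am having trouble fitting the model. Why does it make a difference whether I model the factors as (a) multiple variables (each representing one factor), or (b) a single variable (each factor represented by one value of that variable)?
To illustrate my problem, I chose the following example (adapted from this tutorial):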
### Data with repeated measures
groceries <- data.frame(
  c("lettuce","potatoes","milk","eggs","bread","cereal",
    "ground.beef","tomato.soup","laundry.detergent","aspirin"),
  c(1.17,1.77,1.49,0.65,1.58,3.13,2.09,0.62,5.89,4.46),
  c(1.78,1.98,1.69,0.99,1.70,3.15,1.88,0.65,5.99,4.84),
  c(1.29,1.99,1.79,0.69,1.89,2.99,2.09,0.65,5.99,4.99),
  c(1.29,1.99,1.59,1.09,1.89,3.09,2.49,0.69,6.99,5.15))
colnames(groceries) <- c("subject","storeA","storeB","storeC","storeD")
### Rearranging data for ANOVA (r); 'store' is the variable with 4 factors
groceries <- cbind(stack(groceries[,2:5]), rep(groceries$subject,4))
colnames(groceries) <- c("price", "store", "subject")
### Adding 4 columns for each factor -> thus, 4 boolean variables
factorLookup <- as.character(unique(groceries$store))
groceries <- cbind(groceries,
                   sapply(1 : length(factorLookup),
                          function(x) groceries$store == factorLookup[x]))
names(groceries)[4:7] <- factorLookup
### ANOVA using variable 'store' as factor
aov.1var <- aov(price ~ store + Error(subject/store), data=groceries)
summary(aov.1var)
### ANOVA using 4 factor variables
aov.4fact <- aov(price ~ storeA * storeB * storeC * storeD
                 + Error(subject/(storeA * storeB * storeC * storeD)), data=groceries)
summary(aov.4fact)
These are the results; the two approaches clearly do not agree. In addition, the multi-variable approach raises a warning:
> ### ANOVA using variable 'store' as factor
> aov.1var <- aov(price ~ store + Error(subject/store), data=groceries)
> summary(aov.1var)
Error: subject
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  9  115.2    12.8

Error: subject:store
          Df Sum Sq Mean Sq F value Pr(>F)
store      3 0.5859 0.19529   4.344 0.0127 *
Residuals 27 1.2137 0.04495
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> ### ANOVA using 4 factor variables
> aov.4fact <- aov(price ~ storeA * storeB * storeC * storeD
+ + Error(subject/(storeA * storeB * storeC * storeD)), data=groceries)
Warning message:
In aov(price ~ storeA * storeB * storeC * storeD + Error(subject/(storeA * :
Error() model is singular
> summary(aov.4fact)
Error: subject
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  9  115.2    12.8

Error: subject:storeA
          Df Sum Sq Mean Sq F value Pr(>F)
storeA     1 0.3763  0.3763   16.02 0.0031 **
Residuals  9 0.2115  0.0235
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subject:storeB
          Df Sum Sq Mean Sq F value Pr(>F)
storeB     1 0.0290 0.02904    0.54  0.481
Residuals  9 0.4842 0.05380

Error: subject:storeC
          Df Sum Sq Mean Sq F value Pr(>F)
storeC     1 0.1805 0.18050   3.135   0.11
Residuals  9 0.5181 0.05757
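A quick way to probe the singularity warning is to check whether the four indicator columns are linearly dependent. The sketch below rebuilds the data with its own names (`wide`, `long`, `store_levels` and `design_rank` are illustrative and not part of the code above) and compares the number of columns in an intercept-plus-four-indicators design matrix with its rank. It is only a diagnostic under these assumptions, not a full explanation of what `aov()` does internally.

```r
# Rebuild the example data in long format (same values as in the question).
wide <- data.frame(
  subject = c("lettuce", "potatoes", "milk", "eggs", "bread", "cereal",
              "ground.beef", "tomato.soup", "laundry.detergent", "aspirin"),
  storeA = c(1.17, 1.77, 1.49, 0.65, 1.58, 3.13, 2.09, 0.62, 5.89, 4.46),
  storeB = c(1.78, 1.98, 1.69, 0.99, 1.70, 3.15, 1.88, 0.65, 5.99, 4.84),
  storeC = c(1.29, 1.99, 1.79, 0.69, 1.89, 2.99, 2.09, 0.65, 5.99, 4.99),
  storeD = c(1.29, 1.99, 1.59, 1.09, 1.89, 3.09, 2.49, 0.69, 6.99, 5.15)
)
long <- cbind(stack(wide[, 2:5]), subject = factor(rep(wide$subject, 4)))
colnames(long) <- c("price", "store", "subject")

# One logical indicator column per store, as in the question.
store_levels <- levels(long$store)
for (lvl in store_levels) {
  long[[lvl]] <- long$store == lvl
}

# Each row belongs to exactly one store, so the indicators should sum to 1.
indicator_sums <- rowSums(long[, store_levels])
print(all(indicator_sums == 1))

# Design matrix with an intercept plus all four indicators.
# If the rank is below the column count, the columns are linearly dependent.
X <- model.matrix(~ storeA + storeB + storeC + storeD, data = long)
design_rank <- qr(X)$rank
cat("columns:", ncol(X), " rank:", design_rank, "\n")
```

If the rank comes out lower than the number of columns, at least one indicator carries no information beyond the others, which is one plausible source of the "singular" warning.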
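What puzzles me: with a linear mixed model (e.g. lme {nlme}), both approaches give the same results. Why does this not hold for ANOVA?
Maybe my general understanding of ANOVA is wrong. Why does the single-variable approach not output an F value for each factor?!
Any help is appreciated!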
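For reference, here is a minimal sketch of how such a mixed-model comparison might look. It assumes a random intercept per subject and, for the indicator version, that `storeA` is left out as the reference level; neither choice is stated above, so treat both as assumptions. The names `wide`, `long`, `fit_factor` and `fit_indicators` are illustrative.

```r
library(nlme)

# Rebuild the example data in long format (same values as in the question).
wide <- data.frame(
  subject = c("lettuce", "potatoes", "milk", "eggs", "bread", "cereal",
              "ground.beef", "tomato.soup", "laundry.detergent", "aspirin"),
  storeA = c(1.17, 1.77, 1.49, 0.65, 1.58, 3.13, 2.09, 0.62, 5.89, 4.46),
  storeB = c(1.78, 1.98, 1.69, 0.99, 1.70, 3.15, 1.88, 0.65, 5.99, 4.84),
  storeC = c(1.29, 1.99, 1.79, 0.69, 1.89, 2.99, 2.09, 0.65, 5.99, 4.99),
  storeD = c(1.29, 1.99, 1.59, 1.09, 1.89, 3.09, 2.49, 0.69, 6.99, 5.15)
)
long <- cbind(stack(wide[, 2:5]), subject = factor(rep(wide$subject, 4)))
colnames(long) <- c("price", "store", "subject")

# One logical indicator column per store.
for (lvl in levels(long$store)) {
  long[[lvl]] <- long$store == lvl
}

# Approach 1: a single factor 'store' with a random intercept per subject.
fit_factor <- lme(price ~ store, random = ~ 1 | subject, data = long)

# Approach 2 (assumption): three indicators, with storeA as the reference.
fit_indicators <- lme(price ~ storeB + storeC + storeD,
                      random = ~ 1 | subject, data = long)

# Overall test for 'store' and the per-indicator coefficient tables.
print(anova(fit_factor))
print(summary(fit_indicators)$tTable)

# Compare the fixed-effect estimates of the two parametrizations.
print(all.equal(unname(fixef(fit_factor)), unname(fixef(fit_indicators))))
```

With default treatment contrasts, both formulas should produce the same design matrix, which would be why the two fits agree in this setup. Using all four indicators together with an intercept would not be comparable.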