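I am running a nested ANOVA with the following setup: 2 areas, one a reference and one exposed (column named CI = Control / Impact), and two time periods (before and after the impact, column named BA), with 1 year in the before period and 3 years in the after period. The years are nested within the periods.
My problem is this: if I use the original years (column Time2 in the toy dataset), I get one result. If I rename the years so that they are simply 1 for Before and 1-3 for After, I get a different result.
Question:
Toy dataset: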
toy <- structure(list(BA = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("A", "B"), class = "factor"), Time = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", "2", "3"), class = "factor"),
Time2 = structure(c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), .Label = c("11", "12", "13", "15", "16", "17"), class = "factor"),
Lake = c("Area 1", "Area 1", "Area 1", "Area 1", "Area 1",
"Area 2", "Area 2", "Area 2", "Area 2", "Area 2", "Area 1",
"Area 1", "Area 1", "Area 1", "Area 1", "Area 2", "Area 2",
"Area 2", "Area 2", "Area 2", "Area 1", "Area 1", "Area 1",
"Area 1", "Area 1", "Area 2", "Area 2", "Area 2", "Area 2",
"Area 2", "Area 1", "Area 1", "Area 1", "Area 1", "Area 1",
"Area 2", "Area 2", "Area 2", "Area 2", "Area 2"), CI = structure(c(2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("C", "I"), class = "factor"),
Response = c(78.3, 75.3, 69.4, 75.1, 71.1, 49.7, 61, 59.6,
35.3, 26.5, 80.9, 81.4, 67.6, 73.6, 73, 46.4, 73.6, 67.1,
34, 45.5, 86.6, 78, 68.2, 76.8, 69.6, 52.1, 61.9, 50.8, 39.2,
49.6, 72, 74, 71, 68, 58, 40, 41, 34, 54, 61)), .Names = c("BA",
"Time", "Time2", "Lake", "CI", "Response"), row.names = c(NA,
40L), class = "data.frame")
Analysis using Type I SS:
mod <- lm(Response ~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = toy)
mod1 <- lm(Response ~ BA + CI + BA*CI + BA/Time2 + BA/Time2*CI, data = toy)
# results are the same
anova(mod)
anova(mod1)
Now trying Type II:
library(car)
options(contrasts=c("contr.sum", "contr.poly"))
mod <- lm(Response ~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = toy)
mod1 <- lm(Response ~ BA + CI + BA*CI + BA/Time2 + BA/Time2*CI, data = toy)
Anova(mod, type = "II", singular.ok = TRUE)
Anova(mod1, type = "II", singular.ok = TRUE)
And Type III:
Anova(mod, type = "III", singular.ok = TRUE)
Anova(mod1, type = "III", singular.ok = TRUE)
Answer 0 (score: 0)
A warning up front: this may not be entirely correct, as I am rusty on ANOVA at this point.
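In a Type III SS analysis, we say that main effects are qualified by interactions. In short, this means that in the presence of higher-order interactions, the lower-order interactions and main effects are harder to interpret. Tradition has pushed us toward Type III analyses, but for this reason they are a real pain. Anyway...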
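Let's take a quick look at your recoding. Its effect is that Time2, which has four distinct values, becomes Time, which has three distinct values. You may retain some interpretability, because the combination with level B of BA is unique to the time value that used to be 13 and is now 1.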
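To make the recoding concrete, here is a minimal sketch that cross-tabulates the two time codings against BA. The `design` data frame is my own compact reconstruction of the BA, Time and Time2 columns of `toy` (Response is left out), so treat the names as illustrative rather than as part of the original analysis.

```r
# Compact reconstruction of the design columns of `toy` (an assumption made
# for illustration; the Response column is not needed here).
design <- data.frame(
  BA    = factor(rep(c("A", "B"), times = c(30, 10))),
  Time  = factor(c(rep(1:3, each = 10), rep(1, 10))),
  Time2 = factor(c(rep(c(15, 16, 17), each = 10), rep(13, 10)),
                 levels = c(11, 12, 13, 15, 16, 17))
)

# Time2 declares six levels, but only four of them occur in the data.
levels(design$Time2)
time2_counts <- table(design$BA, design$Time2)
time2_counts

# After recoding, Time has three levels, and level "1" occurs in both periods.
time_counts <- table(design$BA, design$Time)
time_counts
```

The two tables show that the year labelled 13 in Time2 shares its label "1" with the first After year in Time, and that Time2 also carries two declared but empty levels (11 and 12).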
Now back to your data. BA:Time2 now carries all the same information as BA:CI. Here is how that looks in the ANOVA results for the Time2 model...
> Anova(mod1, type = "III", singular.ok = TRUE)
Anova Table (Type III tests)
Response: Response
Sum Sq Df F value Pr(>F)
(Intercept) 73945 1 712.5963 < 2.2e-16 ***
BA 246 1 2.3678 0.1337
CI 2484 1 23.9401 2.713e-05 ***
BA:CI 0 1 0.0046 0.9462
BA:Time2 95 2 0.4570 0.6372
BA:CI:Time2 37 2 0.1797 0.8364
Residuals 3321 32
...the SS for BA:CI is (more or less) 0, as expected. Contrast that with the Time model...
> Anova(mod, type = "III", singular.ok = TRUE)
Anova Table (Type III tests)
Response: Response
Sum Sq Df F value Pr(>F)
(Intercept) 107772 1 1038.5835 < 2.2e-16 ***
BA 209 1 2.0099 0.1659
CI 4220 1 40.6655 3.661e-07 ***
BA:CI 9 1 0.0907 0.7653
BA:Time 95 2 0.4570 0.6372
BA:CI:Time 37 2 0.1797 0.8364
Residuals 3321 32
BA:CI picks up some variance... and the rest of the model seems to do better as well.
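My feeling is that, under every coding scheme, you have a data set that is poorly specified for ANOVA. In both codings, level B of BA is confounded with one level of your time grouping. Also pay close attention to the documentation for Anova in the car package, in particular the part about the singular.ok argument. In short, it says 'defaults to TRUE for type-II tests, and FALSE for type-III tests (where the tests for models with aliased coefficients will not be straightforwardly interpretable)'... and here you appear to have aliased coefficients.
...I never did understand Fox's description of what he is doing in the Type II and Type III tests; I find it... hard to follow. Which reminds me why I always used ezANOVA() from the ez package back in the days when I had to put up with non-Type-I tests.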
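One way to check the suspicion about aliased coefficients is to compare the number of columns in the model matrix with its rank. This is only a sketch: `design` is my own compact reconstruction of the BA, CI and Time columns of `toy`, and because aliasing depends on the design alone, Response is not needed.

```r
# Compact reconstruction of the design columns of `toy` (an assumption made
# for illustration).
design <- data.frame(
  BA   = factor(rep(c("A", "B"), times = c(30, 10))),
  CI   = factor(rep(rep(c("I", "C"), each = 5), times = 4)),
  Time = factor(c(rep(1:3, each = 10), rep(1, 10)))
)

# Same right-hand side as in the question, without the response.
X <- model.matrix(~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = design)

n_cols  <- ncol(X)               # coefficients the formula asks for
x_rank  <- qr(X)$rank            # coefficients the design can identify
n_cells <- nrow(unique(design))  # distinct BA x CI x Time combinations

c(columns = n_cols, rank = x_rank, cells = n_cells)
```

If the rank is smaller than the number of columns, some coefficients cannot be estimated separately. On the fitted models themselves, `alias(mod)` or `anyNA(coef(mod))` should flag the same thing.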