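I'm trying to replicate a 3-factor nested ANOVA from a paper: Underwood, A.J. (1993) The mechanics of spatially replicated sampling programmes to detect environmental impacts in a variable world.
The data for the example (from Table 3, Underwood 1993) can be generated with: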
dat <-
structure(list(B = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Label = c("A", "B"), class = "factor"), C = structure(c(2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("C", "I"), class = "factor"),
Times = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L), .Label = c("1", "2", "3", "4"), class = "factor"),
Locations = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L,
1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L,
2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L,
1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
3L), Y = c(59L, 51L, 45L, 46L, 40L, 32L, 39L, 32L, 25L, 51L,
44L, 37L, 55L, 47L, 41L, 31L, 38L, 45L, 41L, 47L, 55L, 43L,
36L, 29L, 23L, 30L, 37L, 57L, 50L, 43L, 36L, 44L, 51L, 39L,
29L, 23L, 38L, 44L, 52L, 31L, 38L, 45L, 42L, 35L, 28L, 52L,
44L, 37L, 51L, 43L, 37L, 38L, 31L, 24L, 60L, 52L, 46L, 30L,
37L, 44L, 41L, 34L, 27L, 53L, 46L, 39L, 40L, 34L, 26L, 21L,
27L, 35L), Times.unique = structure(c(5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("A_1", "A_2", "A_3",
"A_4", "B_1", "B_2", "B_3", "B_4"), class = "factor")), .Names = c("B",
"C", "Times", "Locations", "Y", "Times.unique"), row.names = c(NA,
-72L), class = "data.frame")
dat
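The data frame dat has 4 factors:
B - two levels, "A" and "B" (After vs Before)
Times - 8 levels, 4 before ("B") and 4 after ("A"), coded 1:4 within each level of B. Note that the variable Times.unique is the same factor but with a unique code for every time (before and after)
Locations - three levels, each measured at every time, both before and after
C - two levels, control (C) and impact (I). Note: two locations are controls and one is the impact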
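As a rough sketch of this layout, the design can be rebuilt without the response values. The object name design and the Rep column are assumptions introduced here for illustration only; the impact location is taken to be location 1, as coded in dat.

```r
# Rebuild the layout only (no response): 2 periods x 4 times x 3 locations
# x 3 replicate samples = 72 rows. `design` and `Rep` are hypothetical names.
design <- expand.grid(
  Rep       = 1:3,                                       # replicates per cell
  Locations = 1:3,                                       # left as integer, as in dat
  Times     = factor(1:4),                               # coded 1:4 within each level of B
  B         = factor(c("B", "A"), levels = c("A", "B"))  # Before rows first, as in dat
)

# Location 1 is the impact location; locations 2 and 3 are controls
design$C <- factor(ifelse(design$Locations == 1, "I", "C"))

# A unique label for each of the 8 sampling times
design$Times.unique <- factor(paste(design$B, design$Times, sep = "_"))

# Every B x Times x Locations cell should hold 3 replicates
with(design, table(B, Times, Locations))

# Column classes: note that Locations is an integer here, not a factor
sapply(design, class)
```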
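While I'm well aware of how to analyse a design like this with a mixed model (lmer), I'd like to replicate his example exactly so that I can run some simulations to compare against his approach.
In particular, I'm trying to replicate the SS values under column "a" of Table 4. He fits a design with the following terms, SS and df values:
B -> SS = 66.13, df = 1
Times(B) -> SS = 280.64, df = 6
Locations -> SS = 2836.86, df = 2
B x Locations -> SS = 29.26, df = 2
Times(B) x Locations -> SS = 575.45, df = 12
Residual -> SS = 2420.00, df = 48
Total -> SS = 6208.34, df = 71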
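I assume the Times(B) term represents Times nested within the before/after factor "B". In this example he ignores the fact that the Locations come from control and impact treatments, and leaves factor C out entirely.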
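The df in the table are consistent with that reading. A minimal check, assuming the usual df rules for a balanced design with Times nested in B and crossed with Locations (the variable names below are introduced here for illustration):

```r
n_b    <- 2  # levels of B (before / after)
n_time <- 4  # times nested within each level of B
n_loc  <- 3  # locations
n_rep  <- 3  # replicate samples per time x location cell

df <- c(
  "B"                    = n_b - 1,
  "Times(B)"             = n_b * (n_time - 1),
  "Locations"            = n_loc - 1,
  "B x Locations"        = (n_b - 1) * (n_loc - 1),
  "Times(B) x Locations" = n_b * (n_time - 1) * (n_loc - 1),
  "Residual"             = n_b * n_time * n_loc * (n_rep - 1)
)

df
sum(df)  # should equal the total df, n_b * n_time * n_loc * n_rep - 1
```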
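I've tried every combination I can think of to reproduce this nested ANOVA, using both the unique Times coding and Times coded 1:4 within B (before and after). I've tried the %in% and / operators and the Error() argument, as well as Anova from the car package to change the type of SS calculated. Examples of the %in% and / nested fits include: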
aov(Y~B+Locations+Times%in%B+B:Locations+Times%in%B:Locations, data=dat)
aov(Y~B+Locations+B/Times+B:Locations+B/Times:Locations, data=dat)
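For reference, the way R expands the / operator can be inspected with terms(), which needs no data. This is only a sketch for looking at the term labels; the object names are introduced here for illustration.

```r
# B/Times is shorthand for B + B:Times (Times nested within B)
nested_labels <- attr(terms(Y ~ B/Times), "term.labels")
nested_labels

# Term labels generated by the full / formula above;
# repeated terms such as B are kept only once
full_labels <- attr(
  terms(Y ~ B + Locations + B/Times + B:Locations + B/Times:Locations),
  "term.labels"
)
full_labels
```

Because : binds more tightly than /, B/Times:Locations is read as B/(Times:Locations), which contributes the three-way term rather than a Times:Locations term on its own.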
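I can't seem to replicate Underwood's SS values exactly, particularly for the two interaction terms. A friend fitted the model for me in Statistix, where the SS values are reproduced exactly, so the SS values above are obtainable for this model.
Can anyone help me fit this model in R? I want to embed it in a larger simulation, so I really need to be able to run the model in R and reproduce the Underwood 1993 SS values exactly.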
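Answer 0 (score: 1)
Your problem is that dat$Locations is an integer when it should be a factor (three distinct locations). One clue is that your ANOVA table gives Locations only 1 df, whereas Underwood gives it 2 df.
Just add the following line: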
dat$Locations = factor(dat$Locations)
Your line of code then reproduces Underwood's results:
aov(Y~B+Locations+B/Times+B:Locations+B/Times:Locations, data=dat)
#Call:
# aov(formula = Y ~ B + Locations + B/Times + B:Locations + B/Times:Locations,
# data = dat)
#
#Terms:
# B Locations B:Times B:Locations B:Locations:Times
#Sum of Squares 66.1250 2836.8611 280.6389 29.2500 575.4444
#Deg. of Freedom 1 2 6 2 12
# Residuals
#Sum of Squares 2420.0000
#Deg. of Freedom 48
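To see the df symptom in isolation, here is a minimal sketch on the same layout with a simulated response. The data frame sim, the seed and the rnorm() parameters are arbitrary assumptions and have nothing to do with Underwood's values; only the degrees of freedom matter here.

```r
set.seed(1)  # arbitrary seed; the response is simulated, not Underwood's data

# Same balanced layout as dat: 2 periods x 4 times x 3 locations x 3 replicates
sim <- expand.grid(Rep = 1:3, Locations = 1:3,
                   Times = factor(1:4), B = factor(c("A", "B")))
sim$Y <- rnorm(nrow(sim), mean = 40, sd = 8)

f <- Y ~ B + Locations + B/Times + B:Locations + B/Times:Locations

# Locations left as an integer: aov() treats it as a numeric covariate,
# so the Locations row of the table gets a single df (one slope)
df_int <- summary(aov(f, data = sim))[[1]]$Df

# Locations converted to a factor: three levels, so 2 df
sim$Locations <- factor(sim$Locations)
df_fac <- summary(aov(f, data = sim))[[1]]$Df

df_int  # second entry is the Locations row: 1
df_fac  # 1 2 6 2 12 48, matching the table above
```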