Post-hoc test for linear mixed model: lsmeans error

Date: 2016-01-07 11:12:32

Tags: r lme4 lsmeans

I have a question about running post hoc tests on a linear mixed model:

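I fitted a linear mixed model in lme4 for 3 groups of 5 snakes each. Each group was held at a different ventilation rate (Vent), measurements were taken at different time points (Time), and snake was specified as a random effect (ID).

A subset of the data, as code: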

subset1 <- structure(list(ID = structure(c(5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 
6L, 6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 
9L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 19L, 19L, 19L, 10L, 10L, 
10L, 10L, 10L, 20L, 20L, 20L, 20L, 20L, 4L, 4L, 4L, 4L, 4L, 11L, 
11L, 11L, 11L, 11L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 12L, 
13L, 14L, 15L, 16L, 17L, 17L, 17L, 17L, 17L), .Label = c("", 
"1_1_2", "10", "10_1_1", "13_1_4", "14_2_4", "15_3_4", "16_1_4", 
"17_2_4", "2_2_1", "5", "5_2_2", "5_2_3", "5_2_4", "5_2_5", "5_2_6", 
"7_1_2", "8", "9", "9_3_1"), class = "factor"), Vent = c(30L, 
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 125L, 
125L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 
125L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 
125L, 125L, 250L, 250L, 250L, 250L, 250L, 250L, 250L, 250L, 250L, 
250L, 250L, 250L, 250L, 250L, 250L, 250L, 250L, 250L, 250L, 250L, 
250L, 250L, 250L, 250L, 250L), Time = c(60L, 80L, 180L, 720L, 
1440L, 60L, 80L, 180L, 720L, 1440L, 60L, 80L, 180L, 720L, 1440L, 
60L, 80L, 180L, 720L, 1440L, 60L, 80L, 180L, 720L, 1440L, 60L, 
80L, 180L, 720L, 1440L, 60L, 80L, 180L, 720L, 1440L, 60L, 80L, 
180L, 720L, 1440L, 60L, 80L, 180L, 720L, 1440L, 60L, 80L, 180L, 
720L, 1440L, 60L, 80L, 180L, 720L, 1440L, 60L, 80L, 180L, 720L, 
1440L, 60L, 80L, 180L, 720L, 1440L, 60L, 80L, 180L, 720L, 1440L, 
60L, 80L, 180L, 720L, 1440L), corr.pO2 = c(224.1388673, 233.9456874, 
239.1553778, 107.2373336, 76.71835625, 164.6293748, 243.8501858, 
234.8205544, 71.74240501, 62.23789874, 69.69478654, 62.23789874, 
152.1142885, 79.61325688, 63.33285001, 240.8713061, 231.304842, 
222.7743953, 95.7912966, 64.41744793, 241.7255035, 238.2936023, 
138.1188987, 43.00663696, 50.64392111, 265.4973967, 274.0599252, 
285.0144919, 83.37647392, NA, 292.3660214, 281.6533627, 275.9747984, 
63.33285001, 56.59660394, 254.2521631, 222.3180596, 208.736288, 
88.83223104, 114.1782867, 208.255285, 232.1878564, 193.3861802, 
72.75355024, 60.01517133, 209.6956308, 245.9596884, 200.4342522, 
75.73874562, 67.61194011, 240.0146049, 261.1278627, 166.9318704, 
74.75152919, 73.75652657, 270.9724687, 251.7882317, 245.9596884, 
147.1396383, 50.64392111, 294.179467, 296.3431178, 284.6426934, 
73.75652657, 75.73874562, 233.0681297, 234.3834557, 143.3247511, 
73.75652657, 66.55672391, 245.9596884, 249.3041163, 223.6847954, 
92.35383362, 78.65544784)), .Names = c("ID", "Vent", "Time", 
"corr.pO2"), row.names = c(NA, 75L), class = "data.frame")

Test for an interaction between ventilation and time:

attach(subset1)

require(lme4)

with.vent = lmer(corr.pO2 ~ Vent * Time + (1|ID),REML = FALSE, data = subset1)

with.vent.no.int = lmer(corr.pO2 ~ Vent + Time + (1|ID),REML = FALSE, data = subset1)

anova(with.vent, with.vent.no.int)
#no significant interaction

Test for the effect of ventilation:

without.vent = lmer(corr.pO2 ~ Time + (1|ID), REML = FALSE, data = subset1)

Compare with the model that includes Vent:

anova(with.vent.no.int, without.vent)
#no significant effect of ventilation treatment p=0.09199

Test for the effect of time:

without.time = lmer(corr.pO2 ~ Vent + (1|ID), data = subset1)

anova(with.vent.no.int, without.time)
# highly significant effect of time on pO2 < 2.2e-16 *** 

So I attempted a post hoc test:

require(lsmeans)
lsmeans(with.vent.no.int, pairwise ~ Time, adjust = "tukey", data = subset1)

This is where I get the error:

Error in solve.default(L %*% V0 %*% t(L), L) : 
  Lapack routine dgesv: system is exactly singular: U[1,1] = 0

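I can run pairwise tests using pairwise.t.test(corr.pO2, Time, p.adj = "BH", paired = T), but I know this is not appropriate where other variables interact (as in my other data), or where I want pairwise comparisons between each time point and ventilation regime. Is that possible?

Thanks for any input. I know that likelihood ratio tests are themselves contested. I considered a mixed-effects ANOVA, but some missing data points make that impossible. Another student previously analyzed these data as a two-way ANOVA without repeated measures, but my feeling is that this is inappropriate because each snake was measured at repeated time points.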

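2 answers:

Answer 0: (score: 4)

The answer is simple: you need to make sure that the Vent and Time predictors are factors. Otherwise lsmeans treats them as numeric covariates and has no levels to compare pairwise. (There is a slightly longer conversation to be had about whether you really want to analyze this model with continuous predictors, i.e. as a response-surface design rather than a two-way ANOVA ...) Here is a slightly more compact version of the analysis: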

subset1 <- transform(subset1,Vent=factor(Vent), Time=factor(Time))
require(lme4)
with.vent = lmer(corr.pO2 ~ Vent * Time + (1|ID),
       REML = FALSE, data = subset1)
drop1(with.vent,test="Chisq")  ## test interaction
with.vent.no.int = update(with.vent, . ~ . - Vent:Time)
drop1(with.vent.no.int,test="Chisq")  ## test main effects
require(lsmeans)
lsmeans(with.vent.no.int, pairwise ~ Time)

Subset of the output:

$contrasts
 contrast    estimate       SE    df t.ratio p.value
 60 - 80     -6.99222 12.76886 63.45  -0.548  0.9819
 60 - 180    14.74281 12.76886 63.45   1.155  0.7768
 60 - 720   147.27139 12.76886 63.45  11.534  <.0001
...

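I do agree that the error message is incomprehensible. It would be worth mentioning this to the lsmeans maintainer, to see whether this (very common) mistake can be detected and flagged.

Answer 1: (score: 3)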
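The question also asks about comparing time points within each ventilation regime when an interaction is present. A minimal sketch of one way to do that, assuming simulated data (the `sim` data frame, its effect sizes, and the `fit` and `by_vent` names are invented for illustration and are not the question's data), uses the `|` conditioning in the lsmeans formula:

```r
library(lme4)
library(lsmeans)

set.seed(1)
# Simulated stand-in for the question's design: 15 snakes, 3 ventilation
# regimes (5 snakes each), 5 time points per snake. All numbers are invented.
sim <- expand.grid(Time = c(60, 80, 180, 720, 1440), ID = 1:15)
sim$Vent <- c(30, 125, 250)[(sim$ID - 1) %/% 5 + 1]
snake_effect <- rnorm(15, sd = 10)
sim$corr.pO2 <- 250 - 0.12 * sim$Time + snake_effect[sim$ID] +
  rnorm(nrow(sim), sd = 15)

# Predictors must be factors for pairwise comparisons to be defined
sim <- transform(sim, ID = factor(ID), Vent = factor(Vent), Time = factor(Time))

fit <- lmer(corr.pO2 ~ Vent * Time + (1 | ID), REML = FALSE, data = sim)

# Pairwise comparisons of time points, separately within each ventilation regime
by_vent <- lsmeans(fit, pairwise ~ Time | Vent)
by_vent$contrasts
```

Swapping the roles, `pairwise ~ Vent | Time`, would instead compare ventilation regimes at each time point. Whether either family of comparisons is appropriate depends on the fitted model, so treat this as a starting point rather than a recommended analysis.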

The next update of lsmeans (probably around February 1, 2016) will catch this kind of error:

> lsmeans(with.vent.no.int, pairwise ~ Vent)

$lsmeans
     Vent   lsmean       SE    df lower.CL upper.CL
 135.1351 167.4871 6.859275 18.63 153.1111  181.863

Confidence level used: 0.95 

$contrasts


Warning message:
In contrast.ref.grid(result, method = contr, by, ...) :
  No contrasts were generated! Perhaps only one lsmean is involved.
  This can happen, for example, when your predictors are not factors.

The ref.grid function is a handy way to see what you have:

> ref.grid(with.vent.no.int)
'ref.grid' object with variables:
    Vent = 135.14
    Time = 483.24

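Vent and Time are both covariates, so by default their means are used. To change this, you don't necessarily have to change the dataset; you can coerce the predictors to factors in the model: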

> repaired = lmer(corr.pO2 ~ factor(Vent) + factor(Time) + (1|ID), 
                  REML = FALSE, data = subset1)
> ref.grid(repaired)
'ref.grid' object with variables:
    Vent =  30, 125, 250
    Time =   60,   80,  180,  720, 1440

> lsmeans(repaired, pairwise ~ Vent)
$lsmeans
 Vent   lsmean       SE    df lower.CL upper.CL
   30 146.0967 12.19373 18.16 120.4952 171.6981
  125 177.0917 12.29417 18.66 151.3274 202.8559
  250 173.2568 11.12879 26.72 150.4111 196.1024

Results are averaged over the levels of: Time 
Confidence level used: 0.95 

$contrasts
 contrast    estimate       SE    df t.ratio p.value
 30 - 125  -30.994975 17.31570 18.41  -1.790  0.2005
 30 - 250  -27.160077 16.50870 21.52  -1.645  0.2490
 125 - 250   3.834898 16.58302 21.81   0.231  0.9710

Results are averaged over the levels of: Time 
P value adjustment: tukey method for comparing a family of 3 estimates
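The single grid values reported for the original model (Vent = 135.14, Time = 483.24) can be reproduced by hand, which shows why there was nothing to contrast. A minimal sketch in base R, assuming only the design columns of the question's data (the `design`, `used`, and `as_factors` names are illustrative):

```r
# Rebuild only the design columns of the question's data (no response needed):
# 3 ventilation regimes x 5 snakes x 5 time points = 75 rows.
design <- data.frame(
  Vent = rep(c(30L, 125L, 250L), each = 25),
  Time = rep(c(60L, 80L, 180L, 720L, 1440L), times = 15)
)

# The question's data has one missing corr.pO2 (row 30: Vent 125, Time 1440),
# and lmer() drops that row before fitting.
used <- design[-30, ]

# Both predictors are numeric, so each contributes one grid value: its mean.
sapply(used, is.numeric)
grid_means <- round(colMeans(used), 2)
grid_means

# As factors, each predictor contributes one grid value per level instead.
as_factors <- transform(used, Vent = factor(Vent), Time = factor(Time))
sapply(as_factors, nlevels)
```

Checking `sapply(data, is.numeric)` on the predictors before calling lsmeans is a cheap way to spot this situation early.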