I am trying to translate the following SAS code into R, in order to obtain the same results as in SAS. Here is the SAS code:
DATA plants;
INPUT sample $ treatmt $ y ;
cards;
1 trt1 6.426264755
1 trt1 6.95419631
1 trt1 6.64385619
1 trt2 7.348728154
1 trt2 6.247927513
1 trt2 6.491853096
2 trt1 2.807354922
2 trt1 2.584962501
2 trt1 3.584962501
2 trt2 3.906890596
2 trt2 3
2 trt2 3.459431619
3 trt1 2
3 trt1 4.321928095
3 trt1 3.459431619
3 trt2 3.807354922
3 trt2 3
3 trt2 2.807354922
4 trt1 0
4 trt1 0
4 trt1 0
4 trt2 0
4 trt2 0
4 trt2 0
;
RUN;
PROC MIXED ASYCOV NOBOUND DATA=plants ALPHA=0.05 method=ML;
CLASS sample treatmt;
MODEL y = treatmt ;
RANDOM int treatmt/ subject=sample ;
RUN;
I get the following covariance parameter estimates from SAS:
Intercept sample ==> 5.5795
treatmt sample ==> -0.08455
Residual ==> 0.3181
I tried the following in R, but I got different results.
s=as.factor(sample)
lmer(y~ 1+treatmt+(1|treatmt:s),REML=FALSE)
Any help is appreciated.
Thanks,
Gram
Answer 0 (score: 2)
I don't know whether you can get exactly the same results in R as in SAS, but I was able to get there by handling the contrasts, as described here:
lmer for SAS PROC MIXED Users, page 6:
In comparing estimates produced by SAS PROC MIXED and by lmer, one must be careful to consider the contrasts used to define the effects of factors. In SAS, a model with an intercept and a qualitative factor is defined in terms of the intercept and the indicator variables for all but the last level of the factor. The default behavior in S is to use the Helmert contrasts for the factor; on a balanced factor these provide a set of orthogonal contrasts. In R the default is the "treatment" contrasts, which are almost the same as the SAS parameterization except that they drop the indicator of the first level rather than the last level. When in doubt, check which contrasts are in use with the contrasts function. To make comparisons easier, you may find it worthwhile to declare
options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly"))
at the beginning of your session.
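To see concretely what this means, one can compare R's default "treatment" contrasts with the SAS-style contrasts for a two-level factor (a small illustration using base R's contrasts() and contr.SAS()):
contrasts(factor(c("trt1", "trt2")))   # R default: drops the indicator of the FIRST level
#      trt2
# trt1    0
# trt2    1
contr.SAS(2)                           # SAS-style: drops the indicator of the LAST level
#   1
# 1 1
# 2 0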
dput:
df <- structure(list(sample = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L),
treatmt = c("trt1", "trt1", "trt1", "trt2", "trt2", "trt2",
"trt1", "trt1", "trt1", "trt2", "trt2", "trt2", "trt1", "trt1",
"trt1", "trt2", "trt2", "trt2", "trt1", "trt1", "trt1", "trt2",
"trt2", "trt2"), y = c(6.426264755, 6.95419631, 6.64385619,
7.348728154, 6.247927513, 6.491853096, 2.807354922, 2.584962501,
3.584962501, 3.906890596, 3, 3.459431619, 2, 4.321928095,
3.459431619, 3.807354922, 3, 2.807354922, 0, 0, 0, 0, 0,
0)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-24L), .Names = c("sample", "treatmt", "y"))
Current code:
options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly"))
df$sample=as.factor(df$sample)
lmer(y~ 1+treatmt+(1|treatmt:sample),REML=FALSE, data = df)
Current output:
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: y ~ 1 + treatmt + (1 | treatmt:sample)
Data: df
AIC BIC logLik deviance df.resid
80.3564 85.0686 -36.1782 72.3564 20
Random effects:
Groups Name Std.Dev.
treatmt:sample (Intercept) 2.344
Residual 0.564
Number of obs: 24, groups: treatmt:sample, 8
Fixed Effects:
(Intercept) treatmttrt1
3.3391 -0.1072
Answer 1 (score: 0)
You are using the SAS option NOBOUND, which allows negative estimates of the variances, and you indeed get a negative estimate. This is not possible with lmer, which constrains the variances to be nonnegative.
We can try to obtain the SAS results by hand. First, note that the equivalent lmer syntax is:
lmer(y ~ 1 + treatment + (1+treatment|sample), REML=FALSE, data = dat)
Let's maximize the log-likelihood while allowing negative variances:
dattxt <- "1 trt1 6.426264755
1 trt1 6.95419631
1 trt1 6.64385619
1 trt2 7.348728154
1 trt2 6.247927513
1 trt2 6.491853096
2 trt1 2.807354922
2 trt1 2.584962501
2 trt1 3.584962501
2 trt2 3.906890596
2 trt2 3
2 trt2 3.459431619
3 trt1 2
3 trt1 4.321928095
3 trt1 3.459431619
3 trt2 3.807354922
3 trt2 3
3 trt2 2.807354922
4 trt1 0
4 trt1 0
4 trt1 0
4 trt2 0
4 trt2 0
4 trt2 0
"
dat <- read.table(text = dattxt)
names(dat) <- c("sample", "treatment", "y")
dat$sample <- as.factor(dat$sample)
opts <- options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly"))
library(lme4)
fit <- lmer(y ~ 1 + treatment + (1+treatment|sample), REML=FALSE, data = dat)
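# Side check: lmer constrains the random-effect covariance matrix to be
# positive semi-definite, so when the unbounded (SAS NOBOUND) solution has a
# negative variance component, the constrained fit is expected to end up on
# the boundary of the parameter space. This can be inspected with VarCorr()
# and isSingular() (the latter is available in recent lme4 versions):
VarCorr(fit)
isSingular(fit)   # TRUE would indicate a boundary (singular) fit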
# marginal variance matrix as a function of the variance components
Vfun <- function(fit, vcs){
Z <- getME(fit, "Z")
n <- getME(fit, "n")
l_i <- getME(fit, "l_i")
sigma2_a <- vcs[1]
sigma2_b <- vcs[2]
sigma_ab <- vcs[3]
sigma2 <- vcs[4]
G <- matrix(c(sigma2_a, sigma_ab, sigma_ab, sigma2_b), nrow = 2)
R <- Diagonal(n, sigma2)
Z %*% bdiag(rep(list(G),l_i)) %*% t(Z) + R
}
# minus log-likelihood
library(mvtnorm)
logLHD <- function(params, fit){
X <- getME(fit, "X")
beta <- params[1:ncol(X)]
y <- getME(fit, "y")
vcs <- tail(params, length(params)-ncol(X))
V <- as.matrix(Vfun(fit, vcs))
if(any(eigen(V)$values <= 0)){
return(runif(1, 1e7, 1e8)) # return a large penalty value if V is not positive definite
}
-dmvnorm(y, c(X%*%beta), sigma = V, log = TRUE)
}
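# Sanity check: plugging the lmer ML estimates into logLHD should reproduce
# (minus) the log-likelihood reported by lmer, up to numerical error.
vc0 <- VarCorr(fit)$sample   # estimated 2x2 covariance matrix of the random effects
logLHD(c(fixef(fit), vc0[1, 1], vc0[2, 2], vc0[1, 2], sigma(fit)^2), fit)
-logLik(fit)                 # these two values should (approximately) agree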
# optimization of log-likelihood
library(dfoptim)
vc <- VarCorr(fit)   # random-effect (co)variance estimates from the lmer fit
start <-
c(fixef(fit), vc$sample[1,1], vc$sample[2,2], vc$sample[1,2], sigma(fit)^2)
names(start)[3:6] <-
c("sample.Intercept", "sample.trt1", "covariance", "sigma2")
opt <- hjkb(start, logLHD, lower=c(-Inf,-Inf,-Inf,-Inf,-Inf,0), fit=fit)
### results
opt$par
# (Intercept) treatmenttrt1 sample.Intercept sample.trt1 covariance sigma2
# 3.33912840 -0.10721533 5.50671885 -0.16909628 0.07275635 0.31812378
The residual variance is the same as the one obtained with SAS. To get the other SAS results, one has to perform some gymnastics on our results. I don't know why, but this is how to get them:
### SAS results
opt$par[["sample.Intercept"]] + opt$par[["covariance"]]
# 5.579475
opt$par[["sample.trt1"]] / 2
# -0.08454814
Note that allowing a negative variance indeed achieves a better maximization of the log-likelihood:
### remark: lmer achieves a lower log-likelihood
logLik(fit)
# 'log Lik.' -27.88947 (df=6)
-opt$value
# -26.43355
I would be grateful if someone could explain the gymnastics that are needed.
Sorry, this was not the right model. The model is:
lmer(y ~ 1 + treatment + (1|sample/treatment), REML=FALSE, data = dat)
And here are the SAS results:
opts <- options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly"))
library(lme4)
fit <- lmer(y ~ 1+treatment+(1|sample/treatment), REML=FALSE, data = dat)
vc <- VarCorr(fit)
# marginal variance matrix for the nested model: two independent variance
# components (treatment within sample, and sample) plus the residual variance
Vfun <- function(fit, vcs){
Z <- getME(fit, "Z")
n <- getME(fit, "n")
l_i <- getME(fit, "l_i")
G <- Diagonal(sum(l_i), rep(vcs[1:2], l_i))
R <- Diagonal(n, vcs[3])
Z %*% G %*% t(Z) + R
}
library(mvtnorm)
logLHD <- function(params, fit){
X <- getME(fit, "X")
beta <- params[1:ncol(X)]
y <- getME(fit, "y")
vcs <- tail(params, length(params)-ncol(X))
V <- as.matrix(Vfun(fit, vcs))
if(any(eigen(V)$values <= 0)) return(runif(1, 1e7, 1e8))
-dmvnorm(y, c(X%*%beta), sigma = V, log = TRUE)
}
library(dfoptim)
start <- c(fixef(fit), vc[[1]], vc[[2]], sigma(fit)^2)  # starting values from the lmer fit
opt <- hjkb(start, logLHD, lower=c(-Inf,-Inf,-Inf,-Inf,0), fit=fit)
opt$par[3:5]
# treatment:sample variance, sample intercept variance, residual variance:
# -0.08454877 5.57947601 0.31812697
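For reference, these unbounded estimates can be compared with what the bounded lmer fit of the same nested model (the fit object created above) reports; since the unbounded treatment-within-sample variance is negative, one would expect the bounded fit to push it towards zero rather than let it go negative:
VarCorr(fit)    # bounded ML estimates of the variance components
logLik(fit)     # log-likelihood of the bounded fit, to compare with -opt$value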