I can't get standard error estimates for the ED (LD) parameters of a probit regression out of the glm() function. The results I obtain do not include a standard error for the ED level of interest (which I need in order to compare two different options with a Student's t-test).
I am using the following function (taken from a journal article):
LD <- function(r, n, d, conf.level) {
## r = number responding, n = number treated, d = dose (untransformed), conf.level = confidence interval level
## Set up a series of response levels (1-99 %)
p <- seq(1, 99, 1)
mod <- glm(cbind(r, (n-r)) ~ log10(d), family = binomial(link=probit))
### Calculate heterogeneity correction to confidence intervals according to Finney, 1971, (p.
### 72, eq. 4.27; also called "h")
het = deviance(mod)/df.residual(mod)
if(het < 1){het = 1} ### Heterogeneity cannot be less than 1
## Extract slope and intercept
summary <- summary(mod, dispersion=het, cor = F)
intercept <- summary$coefficients[1]    # intercept estimate
interceptSE <- summary$coefficients[3]  # intercept SE
slope <- summary$coefficients[2]        # slope estimate
slopeSE <- summary$coefficients[4]      # slope SE
z.value <- summary$coefficients[6]      # z value for the slope
N <- sum(n)
## Intercept (alpha)
b0<-intercept
## Slope (beta)
b1<-slope
## Variance-covariance matrix of the coefficients
vcov <- summary(mod)$cov.unscaled
## Intercept variance
var.b0 <- vcov[1,1]
## Slope variance
var.b1 <- vcov[2,2]
## Slope intercept covariance
cov.b0.b1<-vcov[1,2]
## Adjust alpha depending on heterogeneity (Finney, 1971, p. 76)
alpha<-1-conf.level
if(het > 1) {talpha <- -qt(alpha/2, df=df.residual(mod))} else {talpha <- -qnorm(alpha/2)}
## Calculate g (Finney, 1971, p 78, eq. 4.36)
## "With almost all good sets of data, g will be substantially smaller than 1.0 and
## seldom greater than 0.4."
g <- het * ((talpha^2 * var.b1)/b1^2)
## Calculate theta.hat for all LD levels based on probits in eta (Robertson et al., 2007, pg.
## 27; or "m" in Finney, 1971, p. 78)
eta = family(mod)$linkfun(p/100) #probit distribution curve
theta.hat <- (eta - b0)/b1
## Calculate correction of fiducial limits according to Fieller method (Finney, 1971,
## p. 78-79. eq. 4.35)
const1 <- (g/(1-g))*(theta.hat + cov.b0.b1/var.b1) # const1 <- (g/(1-g))*(theta.hat - cov.b0.b1/var.b1)
const2a <- var.b0 + 2*cov.b0.b1*theta.hat + var.b1*theta.hat^2 - g*(var.b0 - (cov.b0.b1^2/var.b1))
const2 <- talpha/((1-g)*b1) * sqrt(het * (const2a))
## Calculate the confidence intervals LCL=lower, UCL=upper (Finney, 1971, p. 78-79. eq. 4.35)
LCL <- (theta.hat + const1 - const2)
UCL <- (theta.hat + const1 + const2)
## Calculate variance for theta.hat (Robertson et al., 2007, pg. 27)
var.theta.hat <- (1/(theta.hat^2)) * ( var.b0 + 2*cov.b0.b1*theta.hat + var.b1*theta.hat^2 )
## Make a data frame from the data at all the different values
ECtable <- data.frame(
"p"=p,
"N"=N,
"EC"=10^theta.hat,
"LCL"=10^LCL,
"UCL"=10^UCL,
"slope"=slope,
"slopeSE"=slopeSE,
"intercept"=intercept,
"interceptSE"=interceptSE,
"z.value"=z.value,
"chisquare"=deviance(mod),
"df"=df.residual(mod),
"h"=het,
"g"=g,
"theta.hat"=theta.hat,
"var.theta.hat"=var.theta.hat)
## Select output level
return(ECtable)
}
Usage example:
result <- LD(data$effected, data$total, data$dose, 0.95)
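For reference, my input data frame has the shape implied by the call above (columns effected, total, dose); here is a placeholder example showing the expected grouped format (illustrative values only, not the data behind the table below):

## Placeholder data frame, only to show the expected grouped input format
data <- data.frame(dose     = c(1, 2, 4, 8, 16),
                   total    = c(20, 20, 20, 20, 20),
                   effected = c(1, 4, 9, 15, 19))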
In the result I get the estimated ED levels:
p N EC LCL UCL slope slopeSE intercept interceptSE z.value chisquare df h g theta.hat var.theta.hat
49 49 24 39.26365 32.92103 41.18617 30.92579 12.21311 -49.32049 19.69812 2.532179 0.8584527 3 1 0.5991111 1.593991 0.06240212 0.3076671
50 50 24 39.33701 33.16216 41.26921 30.92579 12.21311 -49.32049 19.69812 2.532179 0.8584527 3 1 0.5991111 1.594801 0.06060359 0.3004097
51 51 24 39.41050 33.40318 41.35474 30.92579 12.21311 -49.32049 19.69812 2.532179 0.8584527 3 1 0.5991111 1.595612 0.05888568 0.2933069
But there is no estimated standard error for the ED level of interest, and no degrees of freedom (df) to go with that standard error. Does anyone know how to calculate the standard error and df for a given ED level? Could you help me modify this function to make it more useful?
In various statistics and modelling texts the standard error of an ED level is computed in different ways, for example:

S_m = 1 / (b * sqrt(sum(nw)))
SE(LD50) = 10^m * ln(10) * S_m
V(m) = (1/b^2) * ( 1/sum(nw) + (m - xbar)^2 / sum(nw * (x - xbar)^2) )
SE(ED50) = (ED84 - ED16) / sqrt(2*N)

where N is the total number of animals in the analysed group.
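For reference, a minimal sketch (my addition, not from the article the function came from): for a probit glm fitted on log10(dose), MASS::dose.p() reports log10(ED_p) together with a delta-method SE, which is roughly the modern analogue of the V(m) formula above.

library(MASS)
## Sketch: same probit-on-log10(dose) model as inside LD();
## r, n, d are the response, total and dose vectors as in the function above
mod <- glm(cbind(r, n - r) ~ log10(d), family = binomial(link = "probit"))
ld  <- dose.p(mod, p = c(0.1, 0.5, 0.9))   # log10(ED10), log10(ED50), log10(ED90) with SEs
ld
## Back-transform the SE to the dose scale, as in SE(LD50) = 10^m * ln(10) * S_m:
10^as.vector(ld) * log(10) * as.vector(attr(ld, "SE"))
## Note: dose.p() uses vcov(mod) with dispersion 1, i.e. no heterogeneity factor h.
## For a t-based interval, the df would be df.residual(mod).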
Answer (score 0):

This does not really answer the question about the SE. Instead, it turned out to be a treasure trove for a question of my own that I had already asked.
### Columns of data: response(r) = detected; number_obs(n) = total; dose(d) = conc
mdf = data.frame(conc= c(50, 25,12.5,6.25,3.125, 1),
total= c(10, 10, 10 , 10, 10, 4),
detected=c(10, 10, 10 , 8, 7, 0) )
# model: cbind(r, n - r) ~ d
m3 =glm(cbind(detected, (total-detected)) ~ conc, family=binomial(link="probit"), data=mdf)
Trying it with the raw data in binary form, 1/0 or T/F (the grouped form above "works too"):
expand_probit_data <- function(mdf) {
md_binary = mdf[0, c("conc", "detected")]
for (x in 1:nrow(mdf)){
n = mdf[x,"total"] ; r = mdf[x, "detected"] ; d= mdf[x, "conc"]
md_binary <- rbind(md_binary, data.frame(conc=rep(d, n), detected = c(rep(1,r),rep(0,n-r))))
} # binary data can be T/F or TRUE/FALSE or 1 / 0 # but NOT character or levels e.g. "DET" / "NDET"
return (md_binary)
} # this will expand the short table to long binary form (somewhat reverse of : xtabs(~conc+detected, md_binary))
md_binary <- expand_probit_data(mdf)
# model r ~ d
m4 =glm(detected ~ conc, family=binomial(link="probit"), data=md_binary)
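A quick sanity check (my addition, not in the original answer): both parameterisations give the same coefficient estimates, but deviance and residual df are not comparable between the grouped and binary forms, so a heterogeneity factor computed as deviance(mod)/df.residual(mod) will differ between m3 and m4.

# Same coefficients from the grouped (m3) and binary (m4) fits:
cbind(grouped = coef(m3), binary = coef(m4))
# ...but deviance and residual df differ between the two data representations:
c(dev = deviance(m3), df = df.residual(m3))
c(dev = deviance(m4), df = df.residual(m4))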
Now predict the concentration and confidence limits at the chosen probability (p) levels. This part is lifted essentially straight from your code. Thanks for the thorough research and for providing the references.
get_estimate <- function (mod, p_level= c(0.1, 0.5, 0.9, 0.95, 0.99)) {
### Calculate heterogeneity correction to confidence intervals according to Finney, 1971, (p.72, eq. 4.27; also called "h")
het = deviance(mod)/df.residual(mod) ; if(het < 1){het = 1}
# Heterogeneity cannot be less than 1
# R sets the dispersion parameter to 1 by default, so I use 1 here; change if needed
## Extract slope and intercept
summary <- summary(mod, dispersion= 1, cor = F) # summary(mod, dispersion= het, cor = F) # summary might change if het is lot > 1
intercept <- summary$coefficients[1] ; interceptSE <- summary$coefficients[3]
slope <- summary$coefficients[2] ; slopeSE <- summary$coefficients[4]
z.value <- summary$coefficients[6]
N <- sum(mdf$total) # or for m3 # N <- nrow(md_binary) #this needs to be fixed: getting data from outside the supplied variables
## Intercept (alpha) ## Slope (beta)
b0 <- intercept ; b1 <- slope
## Slope variance # Intercept variance # Slope intercept covariance
vcov = summary(mod)$cov.unscaled
var.b0<-vcov[1,1] ; var.b1<-vcov[2,2] ; cov.b0.b1<-vcov[1,2]
## Adjust alpha depending on heterogeneity (Finney, 1971, p. 76)
alpha = 0.05 # fixed here; otherwise use 1 - conf.level, e.g. for conf.level = 0.95
if(het > 1) {talpha = -qt(alpha/2, df=df.residual(mod))} else {talpha = -qnorm(alpha/2)}
## Calculate g (Finney, 1971, p 78, eq. 4.36) ## "With almost all good sets of data, g will be substantially smaller than 1.0 and seldom greater than 0.4."
g <- het * ((talpha^2 * var.b1)/b1^2)
## Estimate for all LD levels based on probits in eta ~~~~~~~~~~~|
## (Robertson et al., 2007, pg.27; or "m" in Finney, 1971, p. 78) |
eta = family(mod)$linkfun(p_level) # probit distribution curve | p_levels = c(0.5, 0.9, 0.95, 0.99)
eta_conc = (eta - b0)/b1 # returns the conc or dose at p level | b0 = intercept ; b1 = slope
##----- this section was critical to my calculation _______________| No way could I have got this estimate without this(@Mihail Pyatinskyi) post.
# term theta.hat replaced by eta_conc here, for my convenience
## Calculate correction of fiducial limits according to Fieller method (Finney, 1971, p. 78-79. eq. 4.35)
const1 <- (g/(1-g))*(eta_conc + cov.b0.b1/var.b1) # const1 <- (g/(1-g))*(eta_conc - cov.b0.b1/var.b1)
const2a <- var.b0 + 2*cov.b0.b1*eta_conc + var.b1*eta_conc^2 - g*(var.b0 - (cov.b0.b1^2/var.b1))
const2 <- talpha/((1-g)*b1) * sqrt(het * (const2a))
## Calculate the confidence intervals LCL=lower, UCL=upper (Finney, 1971, p. 78-79. eq. 4.35)
LCL <- (eta_conc + const1 - const2)
UCL <- (eta_conc + const1 + const2)
##
# Calculate variance (Robertson et al., 2007, pg. 27)
var.eta_conc <- (1/(eta_conc^2)) * ( var.b0 + 2*cov.b0.b1*eta_conc + var.b1*eta_conc^2 )
xtxt = cbind(p_level,conc_est=eta_conc, LCL=LCL, UCL=UCL, variance=var.eta_conc, sd.eta_conc=sqrt(var.eta_conc))
return (xtxt)
}
0.5 is used for the EC50 or ED50 calculation, like yours (I guess). People use EC90 (0.9), LD10 (0.1) and so on depending on the case. I use it specifically to calculate the LLOD (lower limit of detection, for molecular detection assays) at a probability of 0.95. The output of this script is very close to SPSS's probit calculation.
p_level = c(0.5, 0.9, 0.95, 0.99) # predict at these probability levels
get_estimate(m3, p_level)
Hopefully you can now work out the SE yourself: you already did the variance calculation, and I take its sqrt to get the standard deviation. This was worked out with the example data above.
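For completeness, a small sketch (my own, building on the columns that LD() already returns) of turning that variance into an SE on the original dose scale, together with the df that was asked about:

## Sketch: assuming var.theta.hat is the variance of log10(ED_p), as the LD()
## function intends, take its sqrt and back-transform with the delta method
## (SE(ED_p) ~= 10^m * ln(10) * SE(log10 ED_p)):
result$SE.log10 <- sqrt(result$var.theta.hat)            # SE on the log10(dose) scale
result$SE.EC    <- result$EC * log(10) * result$SE.log10 # SE on the dose scale
## For a t-based comparison of two ED estimates, the df is the residual df of
## each fit, which LD() already returns in the "df" column (df.residual of the glm).
head(result[, c("p", "EC", "SE.EC", "df")])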