C-statistic and 95% confidence interval for a Cox model with time-dependent covariates

Asked: 2018-12-18 22:51:21

Tags: r cox-regression

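I am fitting a Cox regression with time-dependent covariates, and I am particularly interested in a 95% confidence interval for the concordance index. However, the standard summary of a coxph model returns only the concordance index and its standard error. Is it possible to obtain the 95% CI as well?

Thanks!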

library(survival)

temp <- subset(pbc, id <= 312, select=c(id:sex, stage))
pbc2 <- tmerge(temp, temp, id=id, death = event(time, status)) #set range
pbc2 <- tmerge(pbc2, pbcseq, id=id, ascites = tdc(day, ascites),
               bili = tdc(day, bili), albumin = tdc(day, albumin),
               protime = tdc(day, protime), alk.phos = tdc(day, alk.phos))
fit2 <- coxph(Surv(tstart, tstop, death==2) ~ log(bili) + log(protime), pbc2)

summary(fit2)

coxph(formula = Surv(tstart, tstop, death == 2) ~ log(bili) + 
    log(protime), data = pbc2)

  n= 1807, number of events= 125 

                 coef exp(coef) se(coef)      z Pr(>|z|)    
log(bili)     1.24121   3.45981  0.09697 12.800   <2e-16 ***
log(protime)  3.98340  53.69929  0.43589  9.139   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

             exp(coef) exp(-coef) lower .95 upper .95
log(bili)         3.46    0.28903     2.861     4.184
log(protime)     53.70    0.01862    22.853   126.181

**Concordance= 0.886  (se = 0.029 )**
Rsquare= 0.168   (max possible= 0.508 )
Likelihood ratio test= 332.1  on 2 df,   p=<2e-16
Wald test            = 263.3  on 2 df,   p=<2e-16
Score (logrank) test = 467.8  on 2 df,   p=<2e-16

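Would it make sense to obtain a 95% CI for the C-index by bootstrapping with the validate function from the rms package? I came up with the code below. What do you think? I am not sure how to handle the Dxy values from the training and test columns correctly: the CI from the training column looks plausible to me, whereas the CI from the test column looks very narrow.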

library(survival)
library(rms)
library(tidyboot)

temp <- subset(pbc, id <= 312, select=c(id:sex, stage))
pbc2 <- tmerge(temp, temp, id=id, death = event(time, status)) #set range
pbc2 <- tmerge(pbc2, pbcseq, id=id, ascites = tdc(day, ascites),
               bili = tdc(day, bili), albumin = tdc(day, albumin),
               protime = tdc(day, protime), alk.phos = tdc(day, alk.phos))
# x, y and surv are stored in the fit so that validate() can resample it
fit2 <- cph(Surv(tstart, tstop, death==2) ~ log(bili) + log(protime), pbc2,
            x=TRUE, y=TRUE, surv=TRUE)
set.seed(1)
# pr=TRUE prints the indexes for every bootstrap repetition; capture that text
output <- capture.output(validate(fit2, method="boot", B=1000, dxy=TRUE, pr=TRUE))
head(output)
output <- as.matrix(output)
# keep only the printed lines that start with "Dxy"
output_dxy <- as.matrix(output[grep('^Dxy', output[,1]),])
# collapse repeated whitespace and trim both ends so each line splits cleanly
output_dxy <- gsub("(?<=[\\s])\\s*|^\\s+|\\s+$", "", output_dxy, perl=TRUE)
# convert Dxy to the C-index: C = |Dxy|/2 + 0.5 (field 2 = training, field 3 = test)
train <- abs(as.numeric(lapply(strsplit(output_dxy, split=" "), "[", 2))[1:1000])/2+0.5
test <- abs(as.numeric(lapply(strsplit(output_dxy, split=" "), "[", 3))[1:1000])/2+0.5
summary(train)
summary(test)
# percentile limits of the bootstrap distributions
ci_lower(train, na.rm = FALSE)
ci_upper(train, na.rm = FALSE)
ci_lower(test, na.rm = FALSE)
ci_upper(test, na.rm = FALSE)

1 answer:

Answer 0 (score: 0)

As an aside, the relationship is unlikely to be linear in log(bili) and log(protime). Spline functions of the logs are needed.
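As a rough illustration of the spline suggestion, the sketch below refits the question's model with natural cubic splines of the two log-transformed covariates. The choice of splines::ns with df = 3, and the names fit_linear and fit_spline, are assumptions made for this example only; they are not taken from the answer, and other spline bases or knot counts may suit the data better.

```r
library(survival)
library(splines)

# Rebuild the time-dependent data set exactly as in the question
temp <- subset(pbc, id <= 312, select = c(id:sex, stage))
pbc2 <- tmerge(temp, temp, id = id, death = event(time, status))
pbc2 <- tmerge(pbc2, pbcseq, id = id, ascites = tdc(day, ascites),
               bili = tdc(day, bili), albumin = tdc(day, albumin),
               protime = tdc(day, protime), alk.phos = tdc(day, alk.phos))

# Model from the question: linear in the logs
fit_linear <- coxph(Surv(tstart, tstop, death == 2) ~ log(bili) + log(protime),
                    data = pbc2)

# Assumed alternative: natural cubic splines (3 df each) of the logs
fit_spline <- coxph(Surv(tstart, tstop, death == 2) ~
                      ns(log(bili), df = 3) + ns(log(protime), df = 3),
                    data = pbc2)

# Likelihood ratio comparison of the two nested fits
anova(fit_linear, fit_spline)
```

If the comparison shows a clear improvement for the spline fit, that would support the point that the linear-in-log model is too restrictive.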

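Before using the concordance probability estimate of 0.886, you need to verify from the R survival package that

  • the estimate was designed to handle time-dependent covariates
  • the standard error accounts for the uncertainty in estimating the two regression coefficients

If both of these conditions hold, you can use ± 1.96 se to get a rough 0.95 confidence interval for the c-index.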
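For the numbers printed in the question, the rough interval works out as in the sketch below. It uses only the concordance and standard error shown in the summary output; the variable names are made up for this example, and the result is only meaningful if the two conditions listed above hold.

```r
# Values copied from the summary(fit2) output in the question
c_hat <- 0.886   # concordance
se_c  <- 0.029   # its standard error

# Normal-approximation (Wald) interval: estimate +/- z * se
z  <- qnorm(0.975)   # about 1.96
ci <- c(lower = c_hat - z * se_c, upper = c_hat + z * se_c)
round(ci, 3)
# lower upper
# 0.829 0.943
```

Recent versions of the survival package also appear to expose the concordance and its variance through concordance(fit2); check ?concordance in your installed version before relying on that.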