R flexsurv and time-dependent covariates

Date: 2018-04-03 12:53:51

Tags: r survival-analysis

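According to Christopher Jackson (2016) ["flexsurv: A Platform for Parametric Survival Modeling in R", Journal of Statistical Software, 70(1)], the R flexsurv package can also be used to model time-dependent covariates.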

However, even after a lot of tinkering and searching online forums, I could not figure out how to do it.

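Before turning to time-dependent covariates, I tried to fit a simple model with only time-independent covariates, to test whether I am specifying the Surv object correctly. Here is a small example.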

library(splitstackshape)
library(flexsurv)

## create sample data
n=50
set.seed(2)
t <- rpois(n,15)+1
x <- rnorm(n,t,5)
df <- data.frame(t,x)
df$id <- 1:n
df$rep <- df$t-1

which looks like this:

  t         x id rep
1 12 17.696149  1  11
2 12 20.358094  2  11
3 11  2.058789  3  10
4 16 26.156213  4  15
5 13  9.484278  5  12
6 15 15.790824  6  14
...

Long-format data:

long.df <- expandRows(df, "rep")

rep.vec<-c()
for(i in 1:n){
  rep.vec <- c(rep.vec,1:(df[i,"t"]-1))
}

long.df$start <- rep.vec 
long.df$stop  <- rep.vec +1
long.df$censrec <- 0
long.df$censrec<-ifelse(long.df$stop==long.df$t,1,long.df$censrec)

which looks like this:

     t         x id start stop censrec
1    12 17.69615  1     1    2       0
1.1  12 17.69615  1     2    3       0
1.2  12 17.69615  1     3    4       0
1.3  12 17.69615  1     4    5       0
1.4  12 17.69615  1     5    6       0
1.5  12 17.69615  1     6    7       0
1.6  12 17.69615  1     7    8       0
1.7  12 17.69615  1     8    9       0
1.8  12 17.69615  1     9   10       0
1.9  12 17.69615  1    10   11       0
1.10 12 17.69615  1    11   12       1
2    12 20.35809  2     1    2       0

...
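
As a cross-check on the long format above, here is a minimal base-R sketch that builds the same (start, stop] layout without splitstackshape and verifies two properties: each subject has exactly one event, and the event falls in the subject's last interval. The three-row toy data set is made up for illustration and is not from the question; the column names follow the question.

```r
# Toy data mirroring the question's columns (values are illustrative only)
df <- data.frame(id = 1:3, t = c(4, 2, 5), x = c(1.5, 0.3, 2.2))

# One row per unit interval (start, stop], with start running from 1 to t - 1
n_rows <- df$t - 1
long.df <- df[rep(seq_len(nrow(df)), times = n_rows), ]
long.df$start <- sequence(n_rows)
long.df$stop <- long.df$start + 1
long.df$censrec <- as.integer(long.df$stop == long.df$t)

# Sanity checks: exactly one event per subject, located in the last interval
events_per_id <- tapply(long.df$censrec, long.df$id, sum)
last_stop <- tapply(long.df$stop, long.df$id, max)
stopifnot(all(events_per_id == 1), all(last_stop == df$t))
```

Note that, as in the question, the first interval of every subject starts at 1 rather than 0.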

Now I can estimate a simple Cox model to see whether it works:

coxph(Surv(t)~x,data=df)

This yields:

    coef exp(coef) se(coef)     z     p
x -0.0588    0.9429   0.0260 -2.26 0.024

In long format:

coxph(Surv(start,stop,censrec)~x,data=long.df)

I get:

     coef exp(coef) se(coef)     z     p
x -0.0588    0.9429   0.0260 -2.26 0.024
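
The agreement is expected: the Cox partial likelihood depends only on who is at risk at each event time, and splitting a subject's follow-up into consecutive intervals does not change those risk sets. The sketch below checks this programmatically rather than by eye. It uses the nine observations that appear later in the question's edit and rebuilds the long format in base R; the tolerance of 1e-6 is an arbitrary choice.

```r
library(survival)

# Data from the question's follow-up edit
x <- c(8.136527, 7.626712, 9.809122, 12.125973, 12.031536,
       11.238394, 4.208863, 8.809854, 9.723636)
t <- c(2, 3, 13, 5, 7, 37, 37, 9, 4)
df <- data.frame(t, x)

# Long format with unit intervals (start, stop], start = 1, ..., t - 1
n_rows <- df$t - 1
long.df <- df[rep(seq_len(nrow(df)), times = n_rows), ]
long.df$start <- sequence(n_rows)
long.df$stop <- long.df$start + 1
long.df$censrec <- as.integer(long.df$stop == long.df$t)

# Fit both versions and compare the coefficient of x
fit_wide <- coxph(Surv(t) ~ x, data = df)
fit_long <- coxph(Surv(start, stop, censrec) ~ x, data = long.df)
agree <- isTRUE(all.equal(coef(fit_wide), coef(fit_long), tolerance = 1e-6))
print(agree)
```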

In sum, I conclude that my transformation to long format is correct. Now, turning to the flexsurv framework:

flexsurvreg(Surv(time=t)~x,data=df, dist="weibull")

which yields:

Estimates: 
       data mean  est       L95%      U95%      se        exp(est)  L95%      U95%    
shape        NA    5.00086   4.05569   6.16631   0.53452        NA        NA        NA
scale        NA   13.17215  11.27876  15.38338   1.04293        NA        NA        NA
x      15.13380    0.01522   0.00567   0.02477   0.00487   1.01534   1.00569   1.02508

But

flexsurvreg(Surv(start,stop,censrec) ~ x ,data=long.df, dist="weibull")

results in an error:

Error in flexsurvreg(Surv(start, stop, censrec) ~ x, data = long.df, dist = "weibull") : 
  Initial value for parameter 1 out of range

Would anyone happen to know the correct syntax for the latter Surv object? And with the correct syntax, do you get the same estimates?

Many thanks, Best, David

===============

Edit after feedback from 42

===============

library(splitstackshape)
library(flexsurv)
x<-c(8.136527,  7.626712,  9.809122, 12.125973, 12.031536, 11.238394,  4.208863,  8.809854,  9.723636)
t<-c(2, 3, 13,  5,  7, 37 ,37,  9,  4)

df <- data.frame(t,x)

#transform into long format for time-dependent covariates
df$id <- 1:length(df$t)
df$rep <- df$t-1
long.df <- expandRows(df, "rep")

rep.vec<-c()
for(i in 1:length(df$t)){
    rep.vec <- c(rep.vec,1:(df[i,"t"]-1))
}

long.df$start <- rep.vec 
long.df$stop  <- rep.vec +1
long.df$censrec <- 0
long.df$censrec<-ifelse(long.df$stop==long.df$t,1,long.df$censrec)


coxph(Surv(t)~x,data=df)
coxph(Surv(start,stop,censrec)~x,data=long.df)

flexsurvreg(Surv(time=t)~x,data=df, dist="weibull")
flexsurvreg(Surv(start,stop,censrec) ~ x ,data=long.df, dist="weibull",inits=c(shape=.1, scale=1))

Both coxph models give the same estimates, but the two flexsurvreg fits do not:

Call:
flexsurvreg(formula = Surv(time = t) ~ x, data = df, dist = "weibull")

Estimates: 
       data mean  est       L95%      U95%      se        exp(est)  L95%      U95%    
shape        NA     1.0783    0.6608    1.7594    0.2694        NA        NA        NA
scale        NA    27.7731    3.5548  216.9901   29.1309        NA        NA        NA
x        9.3012    -0.0813   -0.2922    0.1295    0.1076    0.9219    0.7466    1.1383

N = 9,  Events: 9,  Censored: 0
Total time at risk: 117
Log-likelihood = -31.77307, df = 3
AIC = 69.54614

Call:
flexsurvreg(formula = Surv(start, stop, censrec) ~ x, data = long.df, 
    dist = "weibull", inits = c(shape = 0.1, scale = 1))

Estimates: 
       data mean  est       L95%      U95%      se        exp(est)  L95%      U95%    
shape        NA     0.8660    0.4054    1.8498    0.3353        NA        NA        NA
scale        NA    24.0596    1.7628  328.3853   32.0840        NA        NA        NA
x        8.4958    -0.0912   -0.3563    0.1739    0.1353    0.9128    0.7003    1.1899

N = 108,  Events: 9,  Censored: 99
Total time at risk: 108
Log-likelihood = -30.97986, df = 3
AIC = 67.95973
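
One possible explanation for the gap, offered as a sketch rather than a confirmed diagnosis: in Surv(start, stop, event) the start time acts as a left-truncation (delayed-entry) time, and the long data above begin at start = 1. The parametric likelihood is then conditional on survival to time 1, which is also why the total time at risk drops from 117 to 108. The Cox fit is unaffected because no event occurs before time 1. The variant below starts the first interval at 0, so the interval contributions should in principle telescope back to the wide-format likelihood. The name long0 is introduced here for illustration only. The flexsurvreg calls are skipped if flexsurv is not installed and, depending on the package version, may need inits to be supplied.

```r
library(survival)

# Data from the question's follow-up edit
x <- c(8.136527, 7.626712, 9.809122, 12.125973, 12.031536,
       11.238394, 4.208863, 8.809854, 9.723636)
t <- c(2, 3, 13, 5, 7, 37, 37, 9, 4)
df <- data.frame(t, x)

# Long format whose first interval starts at 0 instead of 1:
# each subject contributes (0,1], (1,2], ..., (t-1,t]
n_rows <- df$t
long0 <- df[rep(seq_len(nrow(df)), times = n_rows), ]
long0$start <- sequence(n_rows) - 1
long0$stop <- long0$start + 1
long0$censrec <- as.integer(long0$stop == long0$t)

# Total time at risk now matches the wide data
time_at_risk <- sum(long0$stop - long0$start)

# Compare the Weibull fits (only if flexsurv is available)
if (requireNamespace("flexsurv", quietly = TRUE)) {
  fit_wide <- flexsurv::flexsurvreg(Surv(t) ~ x, data = df, dist = "weibull")
  fit_long0 <- flexsurv::flexsurvreg(Surv(start, stop, censrec) ~ x,
                                     data = long0, dist = "weibull")
  print(rbind(wide = fit_wide$res[, "est"], long0 = fit_long0$res[, "est"]))
}
```

In the first, larger example every subject survives well past time 1 and the fitted shape is about 5, so conditioning on survival to time 1 changes the likelihood very little, which would be consistent with the near-identical estimates reported there.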

2 answers:

Answer 0 (score: 1)

Reading the error message:

  

Error in flexsurvreg(Surv(start, stop, censrec) ~ x, data = long.df, dist = "weibull", :     initial values must be a numeric vector

and then the help page ?flexsurvreg, it seemed worth trying to set inits to a named numeric vector:

flexsurvreg(Surv(start,stop,censrec) ~ x ,data=long.df, dist="weibull", inits=c(shape=.1, scale=1))
Call:
flexsurvreg(formula = Surv(start, stop, censrec) ~ x, data = long.df, 
    dist = "weibull", inits = c(shape = 0.1, scale = 1))

Estimates: 
       data mean  est       L95%      U95%      se        exp(est)  L95%      U95%    
shape        NA    5.00082   4.05560   6.16633   0.53454        NA        NA        NA
scale        NA   13.17213  11.27871  15.38341   1.04294        NA        NA        NA
x      15.66145    0.01522   0.00567   0.02477   0.00487   1.01534   1.00569   1.02508

N = 715,  Events: 50,  Censored: 665
Total time at risk: 715
Log-likelihood = -131.5721, df = 3
AIC = 269.1443

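Very similar results. My guess was basically a stab in the dark, so I have no guidance on how to choose the initial values if it had not worked, other than "widen the search".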

Answer 1 (score: 0)

I just want to add that in flexsurv v1.1.1, running the following code:

flexsurvreg(Surv(start,stop,censrec) ~ x ,data=long.df, dist="weibull")

does not return any error. It also gives the same estimates as the non-time-varying command

flexsurvreg(Surv(time=t)~x,data=df, dist="weibull")
