手动计算Kaplan-Meier估计量

时间:2016-08-19 16:18:58

标签: r survival-analysis

我认为这将是一件微不足道的事情,但我仍然在调整编写代码而不是指向和点击电子表格时遇到一些麻烦。

month = as.integer(c(1,2,3,4,5,6,7,8,9,10,11,12))
remaining = c(1000,925,852,790,711,658,601,567,530,501,485,466)
left = c(75, 73, 62, 79, 53, 57, 34, 37, 29, 16, 19, 0)
KPdata = data.frame(month, remaining, left)

> KPdata
   month remaining left    
1      1      1000   75 
2      2       925   73 
3      3       852   62 
4      4       790   79 
5      5       711   53 
6      6       658   57 
7      7       601   34
8      8       567   37 
9      9       530   29 
10    10       501   16 
11    11       485   19 
12    12       466   12 

如何计算每个月的Kaplan-Meier生存函数?请注意,我想手动执行此操作,我知道有些软件包可以帮我完成。

2 个答案:

答案 0 :(得分:3)

我认为这就是你要做的。我们使用lagcumprod来获取手动KM估算工具:

KPdata$KM_init <- lag((KPdata$remaining - KPdata$left) / KPdata$remaining)
KPdata[1,ncol(KPdata)] <- 1
KPdata$KM_final <- cumprod(KPdata$KM_init)

KPdata
   month remaining left   KM_init KM_final
1      1      1000   75 1.0000000    1.000
2      2       925   73 0.9250000    0.925
3      3       852   62 0.9210811    0.852
4      4       790   79 0.9272300    0.790
5      5       711   53 0.9000000    0.711
6      6       658   57 0.9254571    0.658
7      7       601   34 0.9133739    0.601
8      8       567   37 0.9434276    0.567
9      9       530   29 0.9347443    0.530
10    10       501   16 0.9452830    0.501
11    11       485   19 0.9680639    0.485
12    12       466    0 0.9608247    0.466

或者,我认为有一种不同形式的KM估算器看起来像这样(注意我添加了一行对应month = 0):

month = as.integer(c(0,1,2,3,4,5,6,7,8,9,10,11,12))
remaining = c(1000,1000,925,852,790,711,658,601,567,530,501,485,466)
left = c(0,75, 73, 62, 79, 53, 57, 34, 37, 29, 16, 19, 0)
KPdata2 = data.frame(month, remaining, left)

KPdata2$KM_init <- (KPdata2$remaining - KPdata2$left) / KPdata2$remaining
KPdata2$KM_final <- cumprod(KPdata2$KM_init)

KPdata2
   month remaining left   KM_init KM_final
1      0      1000    0 1.0000000    1.000
2      1      1000   75 0.9250000    0.925
3      2       925   73 0.9210811    0.852
4      3       852   62 0.9272300    0.790
5      4       790   79 0.9000000    0.711
6      5       711   53 0.9254571    0.658
7      6       658   57 0.9133739    0.601
8      7       601   34 0.9434276    0.567
9      8       567   37 0.9347443    0.530
10     9       530   29 0.9452830    0.501
11    10       501   16 0.9680639    0.485
12    11       485   19 0.9608247    0.466
13    12       466    0 1.0000000    0.466

答案 1 :(得分:0)

我对这个问题和@bouncyball的鼓舞人心的答案非常关注,我想我会加上我的ha'penny值得尝试处理审查。这是出于原始问题的精神 - “手工”做事来制定关键见解。

## rename remaining -> survived; left -> died 
month = as.integer(c(1,2,3,4,5,6,7,8,9,10,11,12))
survived = c(1000,925,852,790,711,658,601,567,530,501,485,466)
died = c(75, 73, 62, 79, 53, 57, 34, 37, 29, 16, 19, 0)

## arbitrary censoring @ 10 per time period
censored <- c(0, rep(10,11))
KPdata3 = data.frame(month, at.risk, censored, died, survived) 

## define those at risk <= those who survived
## awful bit of R fiddling for (something simple like) offsetting the index in base R
len <- length(month)
at.risk <- c(survived[1], 
             survived[-len] - died[-len] - cumsum(censored[-len]) )
## note use of cumsum()

## censoring uses at risk, rather than survived/remained
KPdata3$KM_increment <- (KPdata3$at.risk - KPdata3$died)/ KPdata3$at.risk
## code credit to @bouncyball
KPdata3$KM_cumulative <- cumprod(KPdata3$KM_increment)
KPdata3

给这个......

   month at.risk censored died survived KM_increment KM_cumulative
1      1    1000        0   75     1000    0.9250000     0.9250000
2      2     925       10   73      925    0.9210811     0.8520000
3      3     842       10   62      852    0.9263658     0.7892637
4      4     770       10   79      790    0.8974026     0.7082873
5      5     681       10   53      711    0.9221733     0.6531636
6      6     618       10   57      658    0.9077670     0.5929203
7      7     551       10   34      601    0.9382940     0.5563336
8      8     507       10   37      567    0.9270217     0.5157333
9      9     460       10   29      530    0.9369565     0.4832197
10    10     421       10   16      501    0.9619952     0.4648551
11    11     395       10   19      485    0.9518987     0.4424949
12    12     366       10    0      466    1.0000000     0.4424949

设置rep(0,11)会给出与@ bouncyball相同的答案。