R: Plotting a survival curve with the types of death as a stacked area chart

Time: 2015-08-05 10:25:44

Tags: r plot duration survival-analysis

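I have a right-censored dataset containing lifetimes for a given sample and information on the type of death. I want to produce a survival curve plot (with the actual values computed from the sample rather than estimated from a model) in which the different types of death are shown as a stacked area chart, like this: [image: desired plot]

How can I accomplish this in R?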

The dataset looks like this (first rows only, taken from the full 50-row data reproduced in the answer below):

death  type   time  event
    1  Type3    81      1
    2  NA      868      0
    3  Type3  1022      1
    4  NA      868      0
    5  NA      868      0

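Here NA for the type of death means censored data, time is either the time of death or the time of censoring, and event is 1 for those who died and 0 for those who were censored. (This is the format required by survfit, but I also have the actual start and end times as dates.)

(Admittedly, with only 50 points it is not possible to build such a curve, but the real data has many more rows that do not fit here.)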

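1 Answer:

Answer 0: (score: 1)

This is ugly code, but it gets the idea across. I did not take the time to figure out how to add a legend. Also note that this kind of figure, while conceptually interesting, will not necessarily reflect the KM curve. To be honest, if you are going to present the data this way, it makes more sense to show it as stacked bars at fixed time points.

Please note that I am fairly sure there are some bugs in this code. It comes with no warranty, but it might get you started.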

SurvData <- structure(list(row.names = c("", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", ""), death = 1:50, type = c("Type3", 
NA, "Type3", NA, NA, NA, NA, NA, "Type3", NA, NA, NA, "Type3", 
"Type3", "Type3", NA, "Type1", NA, NA, NA, "Type3", NA, "Type3", 
"Type1", NA, NA, "Type3", "Type2", NA, NA, NA, NA, "Type3", NA, 
NA, "Type3", NA, NA, NA, "Type3", NA, "Type3", NA, NA, "Type3", 
NA, NA, "Type3", "Type3", NA), time = c(81L, 868L, 1022L, 868L, 
868L, 868L, 868L, 887L, 156L, 868L, 868L, 868L, 354L, 700L, 632L, 
868L, 308L, 1001L, 1054L, 1059L, 120L, 732L, 543L, 379L, 613L, 
1082L, 226L, 1L, 976L, 1000L, 706L, 1015L, 882L, 1088L, 642L, 
953L, 1068L, 819L, 1029L, 34L, 1082L, 498L, 923L, 1041L, 321L, 
557L, 628L, 197L, 155L, 955L), event = c(1L, 0L, 1L, 0L, 0L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 
0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 
0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L)), .Names = c("row.names", 
"death", "type", "time", "event"), class = "data.frame", row.names = c(NA, 
-50L))

library(dplyr)
library(zoo)
library(RColorBrewer)
library(ggplot2)  # needed for ggplot() and geom_rect() below
SurvDataSummary <- 
  arrange(SurvData, time, type) %>%
  mutate(type = ifelse(is.na(type), "Alive", type)) %>%
  group_by(time) %>%
  #* Count the number of each type at each time point
  summarise(n_at_time = n(),
            alive_at_time = sum(type == "Alive"),
            type1_at_time = sum(type == "Type1"),
            type2_at_time = sum(type == "Type2"),
            type3_at_time = sum(type == "Type3")) %>%
  ungroup() %>%
  mutate(n_alive = sum(n_at_time) - cumsum(lag(n_at_time, default = 0)),
         #* Proportion of each type
         p_type1_at_time = type1_at_time / n_alive,
         p_type2_at_time = type2_at_time / n_alive,
         p_type3_at_time = type3_at_time / n_alive,
         #* convert 0 to NA
         p_type1_at_time = ifelse(p_type1_at_time == 0, NA, p_type1_at_time),
         p_type2_at_time = ifelse(p_type2_at_time == 0, NA, p_type2_at_time),
         p_type3_at_time = ifelse(p_type3_at_time == 0, NA, p_type3_at_time),
         #* Fill NAs forward with the last known value (LOCF)
         p_type1_at_time = na.locf(p_type1_at_time, FALSE),
         p_type2_at_time = na.locf(p_type2_at_time, FALSE),
         p_type3_at_time = na.locf(p_type3_at_time, FALSE),
         #* make leading NAs 0
         p_type1_at_time = ifelse(is.na(p_type1_at_time), 0, p_type1_at_time),
         p_type2_at_time = ifelse(is.na(p_type2_at_time), 0, p_type2_at_time),
         p_type3_at_time = ifelse(is.na(p_type3_at_time), 0, p_type3_at_time),
         #* Calculate cumulative proportions
         p_alive_at_time = 1 - p_type1_at_time - p_type2_at_time - p_type3_at_time,
         cump_type1_at_time = p_alive_at_time + p_type1_at_time,
         cump_type2_at_time = cump_type1_at_time + p_type2_at_time,
         cump_type3_at_time = cump_type2_at_time + p_type3_at_time,
         #* Get the following time for using geom_rect
         next_time = lead(time))

pal <- brewer.pal(4, "PRGn")
ggplot(SurvDataSummary,
       aes(xmin = time,
           xmax = next_time)) + 
  geom_rect(aes(ymin = 0, ymax = p_alive_at_time), fill = pal[1]) + 
  geom_rect(aes(ymin = p_alive_at_time, ymax = cump_type1_at_time), fill = pal[2]) + 
  geom_rect(aes(ymin = cump_type1_at_time, ymax = cump_type2_at_time), fill = pal[3]) + 
  geom_rect(aes(ymin = cump_type2_at_time, ymax = cump_type3_at_time), fill = pal[4])
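As for the missing legend, one possible approach (a sketch, not necessarily the best way) is to map fill to a constant label inside aes() and supply the colours through scale_fill_manual(). The data frame bands below is a hypothetical stand-in that only mimics the columns of SurvDataSummary used by the plot, and the colours and the legend title "Status" are arbitrary choices.

```r
library(ggplot2)

# Hypothetical stand-in for SurvDataSummary with the columns the plot uses.
bands <- data.frame(
  time               = c(0, 100, 300),
  next_time          = c(100, 300, 500),
  p_alive_at_time    = c(1.0, 0.8, 0.6),
  cump_type1_at_time = c(1.0, 0.9, 0.7),
  cump_type2_at_time = c(1.0, 0.9, 0.8),
  cump_type3_at_time = c(1.0, 1.0, 1.0)
)

# Named colour vector: the names are matched to the fill labels below.
status_cols <- c(Alive = "#7b3294", Type1 = "#c2a5cf",
                 Type2 = "#a6dba0", Type3 = "#008837")

# Mapping fill to a label inside aes(), instead of setting a fixed colour
# outside it, lets ggplot2 build the legend.
p <- ggplot(bands, aes(xmin = time, xmax = next_time)) +
  geom_rect(aes(ymin = 0, ymax = p_alive_at_time, fill = "Alive")) +
  geom_rect(aes(ymin = p_alive_at_time, ymax = cump_type1_at_time, fill = "Type1")) +
  geom_rect(aes(ymin = cump_type1_at_time, ymax = cump_type2_at_time, fill = "Type2")) +
  geom_rect(aes(ymin = cump_type2_at_time, ymax = cump_type3_at_time, fill = "Type3")) +
  scale_fill_manual(name = "Status", values = status_cols) +
  labs(x = "Time", y = "Proportion")
print(p)
```

The same change should carry over to the real plot by replacing bands with SurvDataSummary.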

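The stacked-bar alternative mentioned at the top of this answer could look roughly like the following base R sketch. It uses a small hypothetical data frame toy with the same type/time/event layout as SurvData, and the cut points are arbitrary. Note the assumption: censoring is ignored, so each bar shows raw sample proportions (a censored subject is simply never counted as dead), not a Kaplan-Meier or cumulative-incidence estimate.

```r
# Hypothetical toy data in the same layout as SurvData;
# NA in `type` marks a censored subject.
toy <- data.frame(
  type  = c("Type3", NA, "Type1", "Type3", NA, "Type2", NA, "Type3"),
  time  = c(81, 868, 308, 156, 887, 400, 613, 700),
  event = c(1, 0, 1, 1, 0, 1, 0, 1)
)

# Fixed time points at which to take a snapshot (an arbitrary choice).
cut_points <- c(200, 400, 600, 800)
types <- c("Type1", "Type2", "Type3")

# For each cut point: share of the whole sample that has died of each
# type by that time. Rows are types, columns are cut points.
dead_by <- sapply(cut_points, function(t) {
  sapply(types, function(k) {
    sum(toy$event == 1 & toy$type == k & toy$time <= t, na.rm = TRUE)
  }) / nrow(toy)
})
colnames(dead_by) <- cut_points

# Everyone not yet observed dead goes into the bottom segment.
props <- rbind("Alive or censored" = 1 - colSums(dead_by), dead_by)

# One stacked bar per cut point, with a legend.
barplot(props,
        col = c("grey80", "#7b3294", "#c2a5cf", "#008837"),
        legend.text = rownames(props),
        args.legend = list(x = "bottomleft"),
        xlab = "Time", ylab = "Proportion of sample")
```

To try it on the real data, replace toy with SurvData and pick cut points that make sense for the follow-up period; with heavy censoring, the later bars will understate the true proportion of deaths.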