How to predict the next event date in R using a hazard model

Time: 2019-08-17 13:53:16

Tags: r forecasting survival-analysis cox-regression

I have data in this format:

mydat <- structure(list(data_user_create.order = structure(c(90L, 94L, 
98L, 102L, 106L, 110L, 114L, 118L, 1L, 5L, 9L, 13L, 17L, 21L, 
25L, 29L, 33L, 37L, 41L, 45L, 49L, 53L, 57L, 61L, 65L, 69L, 73L, 
77L, 81L, 84L, 87L, 91L, 95L, 99L, 103L, 107L, 111L, 115L, 2L, 
6L, 10L, 14L, 18L, 22L, 26L, 30L, 34L, 38L, 42L, 46L, 50L, 54L, 
58L, 62L, 66L, 70L, 74L, 78L, 82L, 85L, 88L, 92L, 96L, 100L, 
104L, 108L, 112L, 116L, 119L, 3L, 7L, 11L, 15L, 19L, 23L, 27L, 
31L, 35L, 39L, 43L, 47L, 51L, 55L, 59L, 63L, 67L, 71L, 75L, 79L, 
83L, 86L, 89L, 93L, 97L, 101L, 105L, 109L, 113L, 117L, 4L, 8L, 
12L, 16L, 20L, 24L, 28L, 32L, 36L, 40L, 44L, 48L, 52L, 56L, 60L, 
64L, 68L, 72L, 76L, 80L), .Label = c("01.09.2017 10:26:25", "01.10.2017 10:26:25", 
"01.11.2017 10:26:56", "01.12.2017 10:27:26", "02.09.2017 10:26:25", 
"02.10.2017 10:26:26", "02.11.2017 10:26:57", "02.12.2017 10:27:27", 
"03.09.2017 10:26:25", "03.10.2017 10:26:27", "03.11.2017 10:26:58", 
"03.12.2017 10:27:28", "04.09.2017 10:26:25", "04.10.2017 10:26:28", 
"04.11.2017 10:26:59", "04.12.2017 10:27:29", "05.09.2017 10:26:25", 
"05.10.2017 10:26:29", "05.11.2017 10:27:00", "05.12.2017 10:27:30", 
"06.09.2017 10:26:25", "06.10.2017 10:26:30", "06.11.2017 10:27:01", 
"06.12.2017 10:27:31", "07.09.2017 10:26:25", "07.10.2017 10:26:31", 
"07.11.2017 10:27:02", "07.12.2017 10:27:32", "08.09.2017 10:26:25", 
"08.10.2017 10:26:32", "08.11.2017 10:27:03", "08.12.2017 10:27:33", 
"09.09.2017 10:26:25", "09.10.2017 10:26:33", "09.11.2017 10:27:04", 
"09.12.2017 10:27:34", "10.09.2017 10:26:25", "10.10.2017 10:26:34", 
"10.11.2017 10:27:05", "10.12.2017 10:27:35", "11.09.2017 10:26:25", 
"11.10.2017 10:26:35", "11.11.2017 10:27:06", "11.12.2017 10:27:36", 
"12.09.2017 10:26:25", "12.10.2017 10:26:36", "12.11.2017 10:27:07", 
"12.12.2017 10:27:37", "13.09.2017 10:26:25", "13.10.2017 10:26:37", 
"13.11.2017 10:27:08", "13.12.2017 10:27:38", "14.09.2017 10:26:25", 
"14.10.2017 10:26:38", "14.11.2017 10:27:09", "14.12.2017 10:27:39", 
"15.09.2017 10:26:25", "15.10.2017 10:26:39", "15.11.2017 10:27:10", 
"15.12.2017 10:27:40", "16.09.2017 10:26:25", "16.10.2017 10:26:40", 
"16.11.2017 10:27:11", "16.12.2017 10:27:41", "17.09.2017 10:26:25", 
"17.10.2017 10:26:41", "17.11.2017 10:27:12", "17.12.2017 10:27:42", 
"18.09.2017 10:26:25", "18.10.2017 10:26:42", "18.11.2017 10:27:13", 
"18.12.2017 10:27:43", "19.09.2017 10:26:25", "19.10.2017 10:26:43", 
"19.11.2017 10:27:14", "19.12.2017 10:27:44", "20.09.2017 10:26:25", 
"20.10.2017 10:26:44", "20.11.2017 10:27:15", "20.12.2017 10:27:45", 
"21.09.2017 10:26:25", "21.10.2017 10:26:45", "21.11.2017 10:27:16", 
"22.09.2017 10:26:25", "22.10.2017 10:26:46", "22.11.2017 10:27:17", 
"23.09.2017 10:26:25", "23.10.2017 10:26:47", "23.11.2017 10:27:18", 
"24.08.2017 10:26:25", "24.09.2017 10:26:25", "24.10.2017 10:26:48", 
"24.11.2017 10:27:19", "25.08.2017 10:26:25", "25.09.2017 10:26:25", 
"25.10.2017 10:26:49", "25.11.2017 10:27:20", "26.08.2017 10:26:25", 
"26.09.2017 10:26:25", "26.10.2017 10:26:50", "26.11.2017 10:27:21", 
"27.08.2017 10:26:25", "27.09.2017 10:26:25", "27.10.2017 10:26:51", 
"27.11.2017 10:27:22", "28.08.2017 10:26:25", "28.09.2017 10:26:25", 
"28.10.2017 10:26:52", "28.11.2017 10:27:23", "29.08.2017 10:26:25", 
"29.09.2017 10:26:25", "29.10.2017 10:26:53", "29.11.2017 10:27:24", 
"30.08.2017 10:26:25", "30.09.2017 10:26:25", "30.10.2017 10:26:54", 
"30.11.2017 10:27:25", "31.08.2017 10:26:25", "31.10.2017 10:26:55"
), class = "factor"), goods = structure(c(2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("Ecoslim", "Hammer"), class = "factor"), 
    id_users = 1:119, event1 = c(0L, 1L, 0L, 1L, 1L, 0L, 1L, 
    0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 
    0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 
    0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 
    0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 
    0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 
    0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 
    0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 
    0L, 1L, 1L, 0L, 1L, 0L, 1L), attemps.to.get.events = 21:139), .Names = c("data_user_create.order", 
"goods", "id_users", "event1", "attemps.to.get.events"), class = "data.frame", row.names = c(NA, 
-119L))

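data_user_create.order is the time at which someone created an order in the shop, and goods is the type of product ordered. id_users is the user ID; it is not important. Different people can order the same product at the same time.

event1: 0 means the event did not occur, 1 means it did.

attemps.to.get.events is a numeric covariate that indicates how many attempts were made before the event did or did not occur. I need to build a forecast for each product that predicts the date of the event in dd-mm-yyyy-hh-mm-ss format.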
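As a side note on the timestamp layout: the values in the data are stored as a factor in "dd.mm.yyyy HH:MM:SS" form. A minimal sketch of how such values could be parsed into date-times, turned into a numeric elapsed time, and printed back in the dd-mm-yyyy-hh-mm-ss layout follows. The toy values, the variable names, and the UTC time zone are assumptions for illustration only.

```r
# Toy values in the same "dd.mm.yyyy HH:MM:SS" layout as the data above.
stamps <- factor(c("24.08.2017 10:26:25", "01.09.2017 10:26:25", "01.12.2017 10:27:26"))

# Go through as.character() first: a factor stores integer codes, not dates.
# tz = "UTC" is an assumption; use the time zone the shop actually records in.
parsed <- as.POSIXct(as.character(stamps), format = "%d.%m.%Y %H:%M:%S", tz = "UTC")

# Numeric elapsed time, in days, since the earliest timestamp.
elapsed_days <- as.numeric(difftime(parsed, min(parsed), units = "days"))

# Render the timestamps in the requested dd-mm-yyyy-hh-mm-ss layout.
formatted <- format(parsed, "%d-%m-%Y-%H-%M-%S")
print(formatted)
```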

My attempt:

library("survival")
library("survminer")  # provides ggsurvplot()
fit <- survfit(Surv(data_user_create.order, event1) ~ goods, data = mydat)
ggsurvplot(fit)

and the error:

Error in Surv(data_user_create.order, event1) : 
  Time variable is not numeric
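The message points at the class of the time column: in the dput output above, data_user_create.order has class "factor", and Surv() only accepts a numeric time. A minimal sketch of one possible workaround is to measure time as days elapsed since a chosen origin. The toy data frame, the time_days column, and the choice of origin (one day before the earliest order) are assumptions, since the question does not define when the clock starts.

```r
library(survival)

# Hypothetical toy data in the same layout as the data frame above.
toy <- data.frame(
  data_user_create.order = factor(c(
    "24.08.2017 10:26:25", "25.08.2017 10:26:25", "26.08.2017 10:26:25",
    "27.08.2017 10:26:25", "28.08.2017 10:26:25", "29.08.2017 10:26:25"
  )),
  goods  = factor(c("Hammer", "Hammer", "Hammer", "Ecoslim", "Ecoslim", "Ecoslim")),
  event1 = c(0L, 1L, 1L, 1L, 0L, 1L)
)

# Parse the factor into date-times (tz = "UTC" is an assumption).
when <- as.POSIXct(as.character(toy$data_user_create.order),
                   format = "%d.%m.%Y %H:%M:%S", tz = "UTC")

# Assumption: the clock starts one day before the earliest order,
# so every duration is strictly positive.
toy$time_days <- as.numeric(difftime(when, min(when), units = "days")) + 1

# With a numeric time, Surv() no longer raises the error.
fit <- survfit(Surv(time_days, event1) ~ goods, data = toy)
print(fit)
```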

When I try to fit a Cox regression:

coxph <- coxph(Surv(mydat$data_user_create.order,mydat$event1) ~ mydat$goods, 
               method="breslow")
summary(coxph)

I get the same error:

Error in Surv(mydat$data_user_create.order, mydat$event1) : 
  Time variable is not numeric

How can I predict the next date on which the event occurs for each product, in dd-mm-yyyy-hh-mm-ss format?

That is, the expected output:

  date.person.creation type.of.goods event
1  24.08.2017 11:04:02       Ecoslim     1
2  24.08.2017 11:04:38       Ecoslim     1
3  24.08.2017 11:05:45       Ecoslim     0
4  24.08.2017 11:04:02        pillow     0
5  24.08.2017 11:04:38        pillow     0
6  24.08.2017 11:05:45        pillow     0
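One way a date per product could be produced, sketched under strong assumptions: fit a Kaplan-Meier curve per product on a numeric duration, take the median time-to-event as a point forecast, and add it to a time origin. The toy durations, the origin, and the use of the median are illustrative choices, not something the question specifies.

```r
library(survival)

# Hypothetical toy durations: days from a chosen origin to the event.
toy <- data.frame(
  time_days = c(1, 2, 3, 4, 5, 6),
  event1    = c(1L, 1L, 1L, 1L, 1L, 1L),
  goods     = factor(c("Hammer", "Hammer", "Hammer", "Ecoslim", "Ecoslim", "Ecoslim"))
)

# Assumed time origin; the question does not say when the clock starts.
origin <- as.POSIXct("24.08.2017 10:26:25", format = "%d.%m.%Y %H:%M:%S", tz = "UTC")

fit <- survfit(Surv(time_days, event1) ~ goods, data = toy)

# Median time-to-event per product (NA if a curve never drops to 0.5).
med <- summary(fit)$table[, "median"]

# Shift the origin by the median (days converted to seconds) and format the result.
pred <- data.frame(
  goods     = sub("^goods=", "", names(med)),
  predicted = format(origin + unname(med) * 86400, "%d-%m-%Y-%H-%M-%S")
)
print(pred)
```

With heavy censoring the median may be undefined, in which case a different quantile or a parametric model would be needed.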

If I change the date format, I get the same error, so I don't think the problem is in the dates.

0 answers:

No answers yet