Turning date ranges into a series of 1s and 0s in R

Time: 2015-06-08 20:22:57

Tags: r date date-range

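I have pharmacy claims data that lists a fill start date and end date for each patient. To make later calculations easier, I want to build a true (1) / false (0) indicator that records, for each patient, whether a given day is covered by one of that patient's fills.

Using the sample data below, I am trying to analyze observations over a fixed ten-day period, 1/1/2013 to 1/10/2013.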

I have played around with ?seq.Date

Data

Patient_ID  Start_Date  End_Date  
a           1/1/2013    1/3/2013  
b           1/3/2013    1/8/2013  
c           1/1/2013    1/10/2013  
d           1/7/2013    1/9/2013
a           1/8/2013    1/9/2013

Desired output (long format)

            a   b   c   d  
1/1/2013    1   0   1   0  
1/2/2013    1   0   1   0  
1/3/2013    1   1   1   0  
1/4/2013    0   1   1   0  
1/5/2013    0   1   1   0  
1/6/2013    0   1   1   0  
1/7/2013    0   1   1   1  
1/8/2013    1   1   1   1  
1/9/2013    1   0   1   1  
1/10/2013   0   0   1   0  

1 answer:

Answer 0 (score: 5)

Try

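library(data.table)
# Expand each claim into one row per covered day; grouping by row number
# as well as Patient_ID keeps patient a's two separate claims apart
res <- setDT(df1)[, seq(as.Date(Start_Date, '%m/%d/%Y'),
    as.Date(End_Date, '%m/%d/%Y'), by='day'), by=list(Patient_ID, 
       1:nrow(df1))]
# Cross-tabulate the dates (column 3) against the patients (column 1)
table(res[,c(3,1), with=FALSE])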

Or using only base R

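 # One daily sequence per claim row, from Start_Date to End_Date
 lst <- Map(seq, as.Date(df1$Start_Date, '%m/%d/%Y'), 
        as.Date(df1$End_Date, '%m/%d/%Y'), by='day') 
 # Format as text first, because unlist() would drop the Date class
 lst <- lapply(lst, format, '%m/%d/%Y')
 table(unlist(lst), rep(df1$Patient_ID,lengths(lst)))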
 #            a b c d
 # 01/01/2013 1 0 1 0
 # 01/02/2013 1 0 1 0
 # 01/03/2013 1 1 1 0
 # 01/04/2013 0 1 1 0
 # 01/05/2013 0 1 1 0
 # 01/06/2013 0 1 1 0
 # 01/07/2013 0 1 1 1
 # 01/08/2013 1 1 1 1
 # 01/09/2013 1 0 1 1
 # 01/10/2013 0 0 1 0

data

 df1 <- structure(list(Patient_ID = c("a", "b", "c", "d", "a"), 
 Start_Date = c("1/1/2013", 
 "1/3/2013", "1/1/2013", "1/7/2013", "1/8/2013"), End_Date =
 c("1/3/2013",  
 "1/8/2013", "1/10/2013", "1/9/2013", "1/9/2013")), 
 .Names = c("Patient_ID", 
 "Start_Date", "End_Date"), class = "data.frame",
  row.names = c(NA, -5L))