How to compute a new variable from the number of days since a record of a specific type

Time: 2017-06-01 10:57:28

Tags: r date dataframe recode

I am trying to create a variable that shows the number of days since a specific event occurred. This is a follow-up to this previous question, using the same data.

The data look like this (note that dates are in DD/MM/YYYY format):

ID  date      drug  score
A   28/08/2016  2   3
A   29/08/2016  1   4
A   30/08/2016  2   4
A   2/09/2016   2   4
A   3/09/2016   1   4
A   4/09/2016   2   4
B   8/08/2016   1   3
B   9/08/2016   2   4
B   10/08/2016  2   3
B   11/08/2016  1   3
C   30/11/2016  2   4
C   2/12/2016   1   5
C   3/12/2016   2   1
C   5/12/2016   1   4
C   6/12/2016   2   4
C   8/12/2016   1   2
C   9/12/2016   1   2 

For 'drug': 1 = drug taken, 2 = drug not taken.

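Whenever drug is 1, and that ID has an earlier record that also has drug == 1, I need to generate a new value, 'lagtime', showing the number of days (not the number of rows) since the drug was last taken.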

So the output I'm looking for is:

ID  date      drug  score  lagtime
A   28/08/2016  2   3
A   29/08/2016  1   4
A   30/08/2016  2   4
A   2/09/2016   2   4
A   3/09/2016   1   4      5
A   4/09/2016   2   4
B   8/08/2016   1   3
B   9/08/2016   2   4
B   10/08/2016  2   3
B   11/08/2016  1   3      3
C   30/11/2016  2   4
C   2/12/2016   1   5
C   3/12/2016   2   1
C   5/12/2016   1   4      3
C   6/12/2016   2   4
C   8/12/2016   1   2      3
C   9/12/2016   1   2      1

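So I need a way to generate (mutate?) this lagtime score, calculated as the date of each drug == 1 record minus the date of the previous drug == 1 record, grouped by ID. This has me completely bamboozled.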

Here is the code for the example data:

data<-data.frame(ID=c("A","A","A","A","A","A","B","B","B","B","C","C","C","C","C","C","C"),
                 date=as.Date(c("28/08/2016","29/08/2016","30/08/2016","2/09/2016","3/09/2016","4/09/2016","8/08/2016","9/08/2016","10/08/2016","11/08/2016","30/11/2016","2/12/2016","3/12/2016","5/12/2016","6/12/2016","8/12/2016","9/12/2016"),format= "%d/%m/%Y"),
                 drug=c(2,1,2,2,1,2,1,2,2,1,2,1,2,1,2,1,1),
                 score=c(3,4,4,4,4,4,3,4,3,3,4,5,1,4,4,2,2))

1 Answer:

Answer 0 (score: 2)

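We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(data)), group by 'ID', specify the i (drug == 1), take the difference of 'date' (diff(date)), concatenate with NA because the output of diff is one element shorter than the original vector, convert to integer, and assign (:=) to create 'lagtime'. By default, all other values will be NA.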
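
To see why the NA is prepended, here is a small base R illustration that uses only the drug == 1 dates for ID A from the example data. The names `taken_dates`, `gaps` and `lagtime_A` are made up for this sketch and are not part of the answer's code.

```r
# Dates on which ID "A" took the drug (drug == 1), from the example data
taken_dates <- as.Date(c("29/08/2016", "3/09/2016"), format = "%d/%m/%Y")

# diff() returns one element fewer than its input
gaps <- diff(taken_dates)
length(gaps)
# [1] 1

# Prepending NA restores the original length; the first drug == 1 record
# has no earlier drug == 1 record to compare against
lagtime_A <- as.integer(c(NA, gaps))
lagtime_A
# [1] NA  5
```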

library(data.table)
setDT(data)[drug==1, lagtime := as.integer(c(NA, diff(date))), ID]
data
#    ID       date drug score lagtime
# 1:  A 2016-08-28    2     3      NA
# 2:  A 2016-08-29    1     4      NA
# 3:  A 2016-08-30    2     4      NA
# 4:  A 2016-09-02    2     4      NA
# 5:  A 2016-09-03    1     4       5
# 6:  A 2016-09-04    2     4      NA
# 7:  B 2016-08-08    1     3      NA
# 8:  B 2016-08-09    2     4      NA
# 9:  B 2016-08-10    2     3      NA
#10:  B 2016-08-11    1     3       3
#11:  C 2016-11-30    2     4      NA
#12:  C 2016-12-02    1     5      NA
#13:  C 2016-12-03    2     1      NA
#14:  C 2016-12-05    1     4       3
#15:  C 2016-12-06    2     4      NA
#16:  C 2016-12-08    1     2       3
#17:  C 2016-12-09    1     2       1
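
Since the question mentions mutate, the same idea can probably be written with dplyr as well. The following is only a sketch, not part of the original answer: it assumes the rows are already sorted by date within each ID (as they are in the example data), and the names `with_lag`, `out` and `taken` are invented for illustration.

```r
library(dplyr)

# Example data from the question
data <- data.frame(
  ID = rep(c("A", "B", "C"), times = c(6, 4, 7)),
  date = as.Date(c("28/08/2016", "29/08/2016", "30/08/2016", "2/09/2016",
                   "3/09/2016", "4/09/2016", "8/08/2016", "9/08/2016",
                   "10/08/2016", "11/08/2016", "30/11/2016", "2/12/2016",
                   "3/12/2016", "5/12/2016", "6/12/2016", "8/12/2016",
                   "9/12/2016"), format = "%d/%m/%Y"),
  drug = c(2, 1, 2, 2, 1, 2, 1, 2, 2, 1, 2, 1, 2, 1, 2, 1, 1),
  score = c(3, 4, 4, 4, 4, 4, 3, 4, 3, 3, 4, 5, 1, 4, 4, 2, 2)
)

with_lag <- data %>%
  group_by(ID) %>%
  mutate(lagtime = {
    # Start with NA for every row in the group
    out <- rep(NA_integer_, n())
    # Rows where the drug was taken
    taken <- drug == 1
    # Days between consecutive drug == 1 dates; NA for the first such row
    out[taken] <- as.integer(c(NA, diff(date[taken])))
    out
  }) %>%
  ungroup()

with_lag$lagtime
#  [1] NA NA NA NA  5 NA NA NA NA  3 NA NA NA  3 NA  3  1
```

If the rows were not sorted, an arrange(ID, date) step before group_by would be needed for the differences to be meaningful.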