I want to calculate the cumulative number of days passed since event==1. Is this possible with data.table?

library(data.table)

df <- data.table(
  id = c("A", "A", "A", "A",
         "B", "B", "B", "B",
         "C", "C"),
  date = c("2000-01-13", "2000-01-18", "2000-01-25", "2000-01-31", # A
           "2012-10-10", "2012-10-11", "2012-10-14", "2012-10-15", # B
           "2005-07-25", "2005-07-31"), # C
  event = c(1, 0, 0, 0,
            0, 0, 1, 0,
            1, 0)
)

Desired outcome:

    id       date event passed
 1:  A 2000-01-13     1      0
 2:  A 2000-01-18     0      5
 3:  A 2000-01-25     0     12
 4:  A 2000-01-31     1      0
 5:  B 2012-10-10     1      0
 6:  B 2012-10-11     0      1
 7:  B 2012-10-14     1      0
 8:  B 2012-10-15     0      1
 9:  C 2005-07-25     1      0
10:  C 2005-07-31     0      6

Edit (12/12/17): I tried @Psidom's solution. It requires the data to be sorted by id and date, which is not a problem. However, note row 6: one day is counted, although this should be 0 because no event has occurred yet in that group.
Answer 0 (score: 3)
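If the event column contains only 0 and 1, you can create a group variable with cumsum(event), which starts a new group whenever event is 1; then group by this new variable and calculate the cumulative days: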
df[, days_from_start := cumsum(c(0, diff(as.Date(date)))), by = cumsum(event)]
#                                                               ^^^^^^^^^^^^^
df
# id date event days_from_start
# 1: A 2000-01-13 1 0
# 2: A 2000-01-18 0 5
# 3: A 2000-01-25 0 12
# 4: A 2000-01-31 1 0
# 5: B 2012-10-10 1 0
# 6: B 2012-10-11 0 1
# 7: B 2012-10-14 1 0
# 8: B 2012-10-15 0 1
# 9: C 2005-07-25 1 0
#10: C 2005-07-31 0 6
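
The question's edit points out that rows occurring before the first event of an id should get 0. One possible way to handle this, sketched here under assumptions rather than taken from the answer, is to count events within each id instead of across the whole table, and to zero out the rows where that count is still 0. The column names episode and passed are illustrative choices, and the sketch assumes event only contains 0 and 1.

```r
library(data.table)

# Data as given in the question
df <- data.table(
  id    = c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C"),
  date  = c("2000-01-13", "2000-01-18", "2000-01-25", "2000-01-31",
            "2012-10-10", "2012-10-11", "2012-10-14", "2012-10-15",
            "2005-07-25", "2005-07-31"),
  event = c(1, 0, 0, 0,
            0, 0, 1, 0,
            1, 0)
)

# Convert once and make sure rows are ordered within each id
df[, date := as.Date(date)]
setorder(df, id, date)

# episode = number of events seen so far within the id (0 means none yet)
df[, episode := cumsum(event), by = id]

# Days elapsed since the first row of each (id, episode) block
df[, passed := as.numeric(date - date[1L]), by = .(id, episode)]

# Rows before the first event of an id have nothing to count from
df[episode == 0, passed := 0]

df$passed
# 0 5 12 18 0 0 0 1 0 6
```

Grouping by id as well as by the event counter keeps the day count from carrying over from one id to the next, which a table-wide cumsum(event) does not guarantee when an id starts with event equal to 0.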