I have the following data frame:
id day total_amount
1 2015-07-09 1000
1 2015-10-22 100
1 2015-11-12 200
1 2015-11-27 2392
1 2015-12-16 123
6 2015-07-09 200
7 2015-07-09 1000
7 2015-08-27 100018
7 2015-11-25 1000
8 2015-08-27 1000
8 2015-12-07 10000
8 2016-01-18 796
8 2016-03-31 10000
15 2015-09-10 1500
15 2015-09-30 1000
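I need to subtract each pair of consecutive dates in the day column when the rows share the same id, continuing until the last row of that id, and then start again with the dates of the next id. Something like the following rows is the expected output: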
7 2015-07-09 1000 2015-08-27 - 2015-07-09
7 2015-08-27 100018 2015-07-09 - 2015-08-27
7 2015-07-09 1000 0
8 2015-08-27 1000 2015-12-07 - 2015-08-27
8 2015-12-07 10000 2016-01-18 - 2015-12-07
8 2016-01-18 796 2016-03-31 - 2016-01-18
8 2016-03-31 10000 0
15 2015-09-10 1000 2015-09-30 - 2015-09-10
15 2015-09-30 1000 2015-10-01 - 2015-09-30
15 2015-10-01 1000
Answer 0 (score: 1)
To get the difference in days, you could try:
library(dplyr)
group_by(df, id) %>% mutate(new = as.Date(lead(day)) - as.Date(day))
Source: local data frame [15 x 4]
Groups: id [5]
id day total_amount new
(int) (fctr) (int) (dfft)
1 1 2015-07-09 1000 105 days
2 1 2015-10-22 100 21 days
3 1 2015-11-12 200 15 days
4 1 2015-11-27 2392 19 days
5 1 2015-12-16 123 NA days
6 6 2015-07-09 200 NA days
7 7 2015-07-09 1000 49 days
8 7 2015-08-27 100018 90 days
9 7 2015-11-25 1000 NA days
10 8 2015-08-27 1000 102 days
11 8 2015-12-07 10000 42 days
12 8 2016-01-18 796 73 days
13 8 2016-03-31 10000 NA days
14 15 2015-09-10 1500 20 days
15 15 2015-09-30 1000 NA days
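If dplyr is not available, the same per-id difference can be sketched in base R. This is only an illustration under a few assumptions: the rows are already sorted by id and day, the data frame is a small inline subset of the question's data (ids 7 and 15), and the last row of each id gets 0 instead of NA, as the question's expected output suggests. The helper names `next_day` and `same_id` are made up for this example.

```r
# Small subset of the question's data (ids 7 and 15), built inline.
df <- data.frame(
  id = c(7, 7, 7, 15, 15),
  day = c("2015-07-09", "2015-08-27", "2015-11-25", "2015-09-10", "2015-09-30"),
  total_amount = c(1000, 100018, 1000, 1500, 1000),
  stringsAsFactors = FALSE
)

d <- as.Date(df$day)
n <- nrow(df)

# Date of the following row; NA for the very last row of the data frame.
next_day <- c(d[-1], as.Date(NA))

# TRUE where the following row belongs to the same id (assumes sorted input).
same_id <- c(df$id[-1] == df$id[-n], FALSE)

# Days until the next row of the same id; 0 for the last row of each id.
df$new <- ifelse(same_id, as.numeric(next_day - d), 0)

print(df)
```

For ids 7 and 15 this gives 49, 90, 0, 20, 0, which agrees with the non-NA values in the dplyr output above.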
EDITED
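To fill the last row of each id with the difference between its date and the current date, you can use:
# First save the above result as `df1`:
# Rows where `new` is NA are the last row of each id.
idx <- is.na(df1$new)
# Keep the whole subtraction in one expression; a line starting with `-` would be parsed as a separate statement.
df1$new[idx] <- as.Date(df1$day[idx]) - Sys.Date()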