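I am trying to find the time difference between consecutive records for each Consumer.Identity. Each Consumer.Identity is independent, so the difference between visits should be computed only within each unique ID.
Sample data: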
tail(cs1[,c('Other', 'Consumer.Identity', 'timestamp')], 20)
Other Consumer.Identity timestamp
8830 672 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-05-29 19:15:00
8838 672 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-05-29 19:45:00
8788 674 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-05-30 13:26:00
12102 665 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-06 18:29:00
11749 663 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-09 08:15:00
11761 663 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-09 08:48:00
11696 663 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-09 14:12:00
11819 663 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-10 08:23:00
11912 663 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-10 16:13:00
13188 673 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-13 18:24:00
14235 667 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-16 15:24:00
14812 673 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-18 16:03:00
20523 650 ff98e12f-24fc-4072-adba-fd15c9481a84 2014-06-26 10:27:00
17856 657 ffa6dab4-361a-4ef0-8e23-53cd6084d01e 2015-01-07 22:59:00
18051 657 ffa6dab4-361a-4ef0-8e23-53cd6084d01e 2015-01-08 08:53:00
25860 657 ffab2368-3b2e-4ee3-9352-5c6520cf81b1 2014-07-30 15:27:00
17163 673 ffab2368-3b2e-4ee3-9352-5c6520cf81b1 2015-01-06 18:21:00
53407 670 ffc3af0b-f3ee-4ca7-a1db-4a9a1f1cf58d 2014-09-15 17:41:00
76334 667 fff9593f-3038-4986-9792-0960fdd87a1b 2014-08-13 17:01:00
41457 667 fff9593f-3038-4986-9792-0960fdd87a1b 2014-08-18 16:48:00
Here is my code. I want to create a separate field called gap. I am also unsure whether to use lag() or diff() here.
library(dplyr)

cs1 %>%
  arrange(Consumer.Identity, timestamp) %>%
  group_by(Consumer.Identity) %>%
  # lag() is NA on the first row of each group, so gap is NA there without an ifelse()
  # units are fixed to minutes (an assumption) so gaps are comparable across groups
  mutate(gap = difftime(timestamp, lag(timestamp), units = "mins")) %>%
  ungroup()
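
On lag() versus diff(): the following is a minimal sketch, not a definitive answer. It assumes timestamp is a POSIXct column and uses a small hypothetical data frame named toy (shortened IDs, timestamps copied from the sample above) in place of cs1. lag() returns a vector the same length as the group with NA in the first position, so it fits directly into mutate(). diff() returns one element fewer, so it has to be padded with a leading NA. The column names gap_lag and gap_diff are illustrative only.

```r
library(dplyr)

# Hypothetical stand-in for cs1: two consumers, two visits each
toy <- data.frame(
  Consumer.Identity = c("ffa6dab4", "ffa6dab4", "fff9593f", "fff9593f"),
  timestamp = as.POSIXct(
    c("2015-01-07 22:59:00", "2015-01-08 08:53:00",
      "2014-08-13 17:01:00", "2014-08-18 16:48:00"),
    tz = "UTC"
  )
)

result <- toy %>%
  arrange(Consumer.Identity, timestamp) %>%
  group_by(Consumer.Identity) %>%
  mutate(
    # lag(): same length as the group, first element is NA
    gap_lag = as.numeric(difftime(timestamp, lag(timestamp), units = "mins")),
    # diff(): one element shorter, so pad with a leading NA; seconds -> minutes
    gap_diff = c(NA, diff(as.numeric(timestamp)) / 60)
  ) %>%
  ungroup()

print(result)
```

Both columns should hold the same gaps in minutes, with NA on the first visit of each consumer. The lag() form is usually easier to read inside mutate() because no padding is needed.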