我有一个数据框,其中包含日期,客户编号和他们访问过的第n次。
Business_Date Cust_ID visit_number
1 2016-11-03 1 1
2 2016-11-20 1 2
3 2016-12-27 1 3
4 2016-11-03 2 1
5 2016-11-04 2 2
6 2016-11-10 2 3
7 2016-11-11 2 4
8 2016-11-19 2 5
9 2016-12-16 2 6
10 2017-01-16 2 1
11 2016-11-17 3 1
12 2016-11-17 3 2
13 2016-11-10 4 1
14 2016-11-12 4 2
15 2016-11-16 4 3
16 2016-11-17 4 4
17 2016-11-20 4 5
18 2016-12-02 4 6
structure(list(Business_Date = structure(c(17108, 17125, 17162,
17108, 17109, 17115, 17116, 17124, 17151, 17182, 17122, 17122,
17115, 17117, 17121, 17122, 17125, 17137), class = "Date"), Cust_ID = c("1",
"1", "1", "2", "2", "2", "2", "2", "2", "2", "3", "3", "4", "4",
"4", "4", "4", "4"), visit_number = c(1, 2, 3, 1, 2, 3, 4, 5,
6, 1, 1, 2, 1, 2, 3, 4, 5, 6)), .Names = c("Business_Date", "Cust_ID",
"visit_number"), row.names = c(NA, -18L), class = "data.frame")
我想创建一个名为cycle的新列,每隔5次访问就会对访问次数进行分类。
Business_Date Cust_ID visit_number cycle
1 11/3/2016 1 1 1
2 11/20/2016 1 2 1
3 12/27/2016 1 3 1
4 11/3/2016 2 1 1
5 11/4/2016 2 2 1
6 11/10/2016 2 3 1
7 11/11/2016 2 4 1
8 11/19/2016 2 5 1
9 12/16/2016 2 6 2
10 1/16/2017 2 1 1
11 11/17/2016 3 1 1
12 11/17/2016 3 2 1
13 11/10/2016 4 1 1
14 11/12/2016 4 2 1
15 11/16/2016 4 3 1
16 11/17/2016 4 4 1
17 11/20/2016 4 5 1
18 12/2/2016 4 6 2
我会使用cut函数,但理论上这些周期可以跨越正无穷大。
答案 0 :(得分:1)
你可以除以6,取商,然后加1(因为否则它从0开始)。 使用dplyr解决方案。
library(tidyverse)
df_Cycle <- df %>% group_by(Cust_ID) %>%
mutate(cycle = visit_number %/% 6 +1)
提供以下数据帧:
# A tibble: 18 x 4
# Groups: Cust_ID [4]
Business_Date Cust_ID visit_number cycle
<chr> <int> <int> <dbl>
1 2016-11-03 1 1 1
2 2016-11-20 1 2 1
3 2016-12-27 1 3 1
4 2016-11-03 2 1 1
5 2016-11-04 2 2 1
6 2016-11-10 2 3 1
7 2016-11-11 2 4 1
8 2016-11-19 2 5 1
9 2016-12-16 2 6 2
10 2017-01-16 2 1 1
11 2016-11-17 3 1 1
12 2016-11-17 3 2 1
13 2016-11-10 4 1 1
14 2016-11-12 4 2 1
15 2016-11-16 4 3 1
16 2016-11-17 4 4 1
17 2016-11-20 4 5 1
18 2016-12-02 4 6 2
答案 1 :(得分:0)
请检查您的数据输入,此答案取决于您所需的输出。(对于第10行,visit_number是7还是1?)
join