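How to set intervals by group in R

Time: 2017-05-18 21:46:33

Tags: r dplyr

I have a data frame containing a date, a customer ID, and the visit number (which visit it was for that customer).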

   Business_Date Cust_ID visit_number
1     2016-11-03       1            1
2     2016-11-20       1            2
3     2016-12-27       1            3
4     2016-11-03       2            1
5     2016-11-04       2            2
6     2016-11-10       2            3
7     2016-11-11       2            4
8     2016-11-19       2            5
9     2016-12-16       2            6
10    2017-01-16       2            1
11    2016-11-17       3            1
12    2016-11-17       3            2
13    2016-11-10       4            1
14    2016-11-12       4            2
15    2016-11-16       4            3
16    2016-11-17       4            4
17    2016-11-20       4            5
18    2016-12-02       4            6

structure(list(Business_Date = structure(c(17108, 17125, 17162, 
17108, 17109, 17115, 17116, 17124, 17151, 17182, 17122, 17122, 
17115, 17117, 17121, 17122, 17125, 17137), class = "Date"), Cust_ID = c("1", 
"1", "1", "2", "2", "2", "2", "2", "2", "2", "3", "3", "4", "4", 
"4", "4", "4", "4"), visit_number = c(1, 2, 3, 1, 2, 3, 4, 5, 
6, 1, 1, 2, 1, 2, 3, 4, 5, 6)), .Names = c("Business_Date", "Cust_ID", 
"visit_number"), row.names = c(NA, -18L), class = "data.frame")

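I want to create a new column called cycle that groups the visit numbers into blocks of 5 visits: visits 1 to 5 are cycle 1, visits 6 to 10 are cycle 2, and so on.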

       Business_Date Cust_ID visit_number cycle
1      11/3/2016       1            1     1
2     11/20/2016       1            2     1
3     12/27/2016       1            3     1
4      11/3/2016       2            1     1
5      11/4/2016       2            2     1
6     11/10/2016       2            3     1
7     11/11/2016       2            4     1
8     11/19/2016       2            5     1
9     12/16/2016       2            6     2
10     1/16/2017       2            1     1
11    11/17/2016       3            1     1
12    11/17/2016       3            2     1
13    11/10/2016       4            1     1
14    11/12/2016       4            2     1
15    11/16/2016       4            3     1
16    11/17/2016       4            4     1
17    11/20/2016       4            5     1
18     12/2/2016       4            6     2

I would use the cut function, but the visit numbers have no upper bound, so in theory the cycles can go on indefinitely.
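For what it is worth, cut does not need an open-ended set of breaks if the breaks are built from the largest visit number actually present. The sketch below uses made-up visit numbers (not the data above) and base R only. It shows that approach next to a plain arithmetic alternative, with a block size of 5 assumed from the description.

```r
# Hypothetical visit numbers for illustration, not the question's data
visit_number <- c(1, 5, 6, 10, 11, 23)

# Option A: cut() with breaks derived from the observed maximum.
# Intervals are (0,5], (5,10], ... and labels = FALSE returns the bin index.
upper <- ceiling(max(visit_number) / 5) * 5
cycle_cut <- cut(visit_number, breaks = seq(0, upper, by = 5), labels = FALSE)

# Option B: plain arithmetic, no breaks needed
cycle_arith <- ceiling(visit_number / 5)

print(data.frame(visit_number, cycle_cut, cycle_arith))
```

Both columns should agree. The arithmetic version avoids recomputing the breaks when new, larger visit numbers arrive.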

2 Answers:

Answer 0: (score: 1)

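You can subtract 1, integer-divide by 5, take the quotient, and add 1 (otherwise the cycles start at 0). Here is a dplyr solution.

library(tidyverse)
# (visit_number - 1) %/% 5 is 0 for visits 1-5, 1 for visits 6-10, and so on
df_Cycle <- df %>% group_by(Cust_ID) %>% 
  mutate(cycle = (visit_number - 1) %/% 5 + 1)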

This gives the following data frame:

# A tibble: 18 x 4
# Groups:   Cust_ID [4]
   Business_Date Cust_ID visit_number cycle
           <chr>   <int>        <int> <dbl>
 1    2016-11-03       1            1     1
 2    2016-11-20       1            2     1
 3    2016-12-27       1            3     1
 4    2016-11-03       2            1     1
 5    2016-11-04       2            2     1
 6    2016-11-10       2            3     1
 7    2016-11-11       2            4     1
 8    2016-11-19       2            5     1
 9    2016-12-16       2            6     2
10    2017-01-16       2            1     1
11    2016-11-17       3            1     1
12    2016-11-17       3            2     1
13    2016-11-10       4            1     1
14    2016-11-12       4            2     1
15    2016-11-16       4            3     1
16    2016-11-17       4            4     1
17    2016-11-20       4            5     1
18    2016-12-02       4            6     2
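The sample data only contains visit numbers 1 to 6, so more than one integer-division formula reproduces the table above. As a rough check, this base R sketch compares two candidates on visit numbers 1 to 12 (a made-up range, chosen only to expose the block boundaries).

```r
# Hypothetical range of visit numbers, longer than in the sample data
visit_number <- 1:12

# Candidate 1: divide by 6, add 1
by_six <- visit_number %/% 6 + 1

# Candidate 2: shift by 1, divide by 5, add 1
by_five <- (visit_number - 1) %/% 5 + 1

print(data.frame(visit_number, by_six, by_five))

# Visit numbers where the two candidates disagree
print(visit_number[by_six != by_five])
```

The two agree for visits 1 to 10 and first differ at visit 11. Dividing by 6 gives a first block of five visits followed by blocks of six, while the shifted division by 5 keeps every block at five visits.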

Answer 1: (score: 0)

Please check your input data; this answer depends on your desired output. (For row 10, should visit_number be 7 or 1?)
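If row 10 is meant to be customer 2's seventh visit, one option is to rebuild the visit counter per customer from the dates before computing the cycle. The base R sketch below uses only the rows for customers 1 and 2 from the question. The column name visit_seq is made up for illustration, and the block size of 5 is assumed from the question's description.

```r
# Rows for customers 1 and 2, copied from the question's data
df <- data.frame(
  Business_Date = as.Date(c("2016-11-03", "2016-11-20", "2016-12-27",
                            "2016-11-03", "2016-11-04", "2016-11-10",
                            "2016-11-11", "2016-11-19", "2016-12-16",
                            "2017-01-16")),
  Cust_ID = c("1", "1", "1", "2", "2", "2", "2", "2", "2", "2"),
  visit_number = c(1, 2, 3, 1, 2, 3, 4, 5, 6, 1),
  stringsAsFactors = FALSE
)

# Sort by customer and date so the counter follows chronological order
df <- df[order(df$Cust_ID, df$Business_Date), ]

# Recount visits within each customer (visit_seq is a hypothetical name)
df$visit_seq <- ave(seq_along(df$Cust_ID), df$Cust_ID, FUN = seq_along)

# Blocks of 5 visits: 1-5 -> cycle 1, 6-10 -> cycle 2, ...
df$cycle <- (df$visit_seq - 1) %/% 5 + 1

print(df)
```

With this recount, the 2017-01-16 row becomes visit 7 and lands in cycle 2. If the reset to 1 in the original data is intentional, the existing visit_number column can be used directly instead.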

join