SQL Server: Group by 30-day windows

Time: 2017-11-17 18:04:19

Tags: sql sql-server group-by

I have developed this SQL script, which performs a GROUP BY over 30-day windows:

SELECT MIN(id.customer_id),
       DATEADD(DAY, DATEDIFF(DAY, 0, e.transaction_datetime) / 30 * 30, 0) as [window_start_dt],
       DATEADD(DAY, DATEDIFF(DAY, -30, e.transaction_datetime) / 30 * 30, 0) as [window_end_dt]
FROM event as e
INNER JOIN customer_identity as id
      ON id.customer_id = e.customer_id
WHERE e.transaction_datetime BETWEEN '2003-01-06' AND '2017-12-31'
GROUP BY id.customer_id, 
      DATEADD(DAY, DATEDIFF(DAY, 0, e.transaction_datetime) / 30 * 30, 0),
      DATEADD(DAY, DATEDIFF(DAY, -30, e.transaction_datetime) / 30 * 30, 0)    
ORDER BY [window_start_dt], [window_end_dt]

The result is as follows:

customer_id,window_start_dt,window_end_dt
1,2003-01-06,2003-02-05 

However, that is not what I want. The 30-day window I want is:

customer_id,window_start_dt,window_end_dt
1,2003-01-06,2003-02-05

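So my problem is that window_end_dt is off. Currently I am passing -30 to DATEDIFF, which is a bit odd, so a better way to compute window_end_dt is welcome.

Edit

Here is a small sample data set covering at least the first month: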

customer_id, transaction_datetime
1, 2013-02-04
1, 2013-01-21
1, 2013-01-22
1, 2013-01-27
2, 2013-02-02
2, 2013-01-08
2, 2013-01-19
2, 2013-01-21
3, 2013-02-03
3, 2013-01-15
3, 2013-01-19

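In addition, I would like to be able to choose an arbitrary start date for the windows if possible (not aligned to months). For example, ideally I would like the windows to start on 2003-01-06.

Edit

I made changes to reflect the desired start date of 2003-01-06 for the 30-day windows and to avoid confusion. I am computing other columns within these windows, but I removed them to keep things simple and to focus on the date logic of the GROUP BY.

2 answers:

Answer 0: (score: 1)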

select customer_id,
       DATEADD(DAY,(DATEDIFF(DAY,'2013-01-06',transaction_datetime)/30)*30,'2013-01-06') window_start,
       DATEADD(DAY,(DATEDIFF(DAY,'2013-01-06',transaction_datetime)/30)*30+29,'2013-01-06') window_end
  from event
 where transaction_datetime>='2013-01-06'
 group by customer_id,DATEDIFF(DAY,'2013-01-06',transaction_datetime)/30

Tested on sqlfiddle.com

Answer 1: (score: 0)

I'm confused about the end goal, but you could build your running date windows with a CTE and then join to it...
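The reason the anchor date matters: DATEDIFF(DAY, 0, ...) counts days from 1900-01-01, so the buckets in the question line up with that date rather than with a chosen start. The following is only a sketch of the same idea with the start date and window size pulled into variables. The names @event, @anchor, @size and transaction_count are assumptions made for illustration, and the rows are the sample rows from the question.

```sql
-- Hypothetical table variable standing in for the question's event table.
DECLARE @event TABLE (customer_id int, transaction_datetime datetime);
INSERT INTO @event VALUES
(1, '20130204'), (1, '20130121'), (1, '20130122'), (1, '20130127'),
(2, '20130202'), (2, '20130108'), (2, '20130119'), (2, '20130121'),
(3, '20130203'), (3, '20130115'), (3, '20130119');

DECLARE @anchor date = '20130106';  -- assumed start of the first window
DECLARE @size int = 30;             -- window length in days

SELECT e.customer_id,
       w.window_start,
       DATEADD(DAY, @size - 1, w.window_start) AS window_end,
       COUNT(*) AS transaction_count  -- example of an extra per-window column
FROM @event AS e
-- Integer division maps each date to the number of whole windows since @anchor.
CROSS APPLY (SELECT DATEADD(DAY,
                            DATEDIFF(DAY, @anchor, e.transaction_datetime) / @size * @size,
                            @anchor) AS window_start) AS w
-- Rows before @anchor are excluded; integer division would otherwise
-- truncate toward zero and place them in the first window.
WHERE e.transaction_datetime >= @anchor
GROUP BY e.customer_id, w.window_start
ORDER BY w.window_start, e.customer_id;
```

With these rows every transaction falls in the first window, 2013-01-06 to 2013-02-04, because the latest date (2013-02-04) is 29 days after the anchor.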

declare @table table (customer_id int, transaction_datetime datetime)
insert into @table
values
(1, '2013-01-06'),
(1, '2013-01-21'),
(1, '2013-01-22'),
(1, '2013-01-27'),
(2, '2013-01-02'),
(2, '2013-01-08'),
(2, '2013-01-19'),
(2, '2013-01-21'),
(3, '2013-01-27'),
(3, '2013-01-15'),
(3, '2013-01-19'),
(3, '2013-02-19'),    -- I added the following 3 rows to show where an id could fall in multiple windows
(3, '2013-03-14'),
(3, '2013-01-29')


declare @startDate date = '20130101'
declare @endDate date = (select max(transaction_datetime ) from @table)

;with dates as(
    select TheDate = @startDate
    union all
    select TheDate = dateadd(day,30,TheDate)
    from dates
    where TheDate <= @endDate
)

select distinct
    customer_id
    ,StartWindow = TheDate 
    ,EndWindow = dateadd(day,29,TheDate)
from @table
    inner join dates on 
    transaction_datetime between TheDate and dateadd(day,29,TheDate)
option (maxrecursion 0)

So for your data... maybe something like this...

+-------------+-------------+------------+
| customer_id | StartWindow | EndWindow  |
+-------------+-------------+------------+
|           1 | 2013-01-01  | 2013-01-30 |
|           2 | 2013-01-01  | 2013-01-30 |
|           3 | 2013-01-01  | 2013-01-30 |
|           3 | 2013-01-31  | 2013-03-01 |
|           3 | 2013-03-02  | 2013-03-31 |
+-------------+-------------+------------+
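One thing to watch with the BETWEEN join above: the window end is a date, so a transaction with a time of day on the last day of a window (for example 23:15 on 2013-01-30) compares as later than the end and matches no window. A possible variation, sketched here under the assumption that transaction_datetime may carry a time component, joins on a half-open range instead. The two rows and the TransactionCount column are hypothetical and exist only to show the boundary.

```sql
-- Hypothetical rows chosen to sit on either side of a window boundary.
DECLARE @event TABLE (customer_id int, transaction_datetime datetime);
INSERT INTO @event VALUES
(3, '20130130 23:15:00'),  -- late on the last day of the first window
(3, '20130131 00:00:00');  -- first instant of the second window

DECLARE @startDate date = '20130101';
DECLARE @endDate date = (SELECT MAX(transaction_datetime) FROM @event);

WITH dates AS (
    SELECT TheDate = @startDate
    UNION ALL
    SELECT DATEADD(DAY, 30, TheDate)
    FROM dates
    -- Stop once the next window would start after the last transaction.
    WHERE DATEADD(DAY, 30, TheDate) <= @endDate
)
SELECT e.customer_id,
       StartWindow = d.TheDate,
       EndWindow = DATEADD(DAY, 29, d.TheDate),
       TransactionCount = COUNT(*)
FROM @event AS e
INNER JOIN dates AS d
    -- Half-open range: start inclusive, start of the next window exclusive.
    ON e.transaction_datetime >= d.TheDate
   AND e.transaction_datetime < DATEADD(DAY, 30, d.TheDate)
GROUP BY e.customer_id, d.TheDate
ORDER BY d.TheDate
OPTION (MAXRECURSION 0);
```

Here the 23:15 row lands in the window starting 2013-01-01 and the midnight row in the window starting 2013-01-31. Grouping instead of using DISTINCT also leaves room for the other per-window columns mentioned in the question.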
