I've written this SQL script to perform a GROUP BY over 30-day windows:
SELECT MIN(id.customer_id),
DATEADD(DAY, DATEDIFF(DAY, 0, e.transaction_datetime) / 30 * 30, 0) as [window_start_dt],
DATEADD(DAY, DATEDIFF(DAY, -30, e.transaction_datetime) / 30 * 30, 0) as [window_end_dt]
FROM event as e
INNER JOIN customer_identity as id
ON id.customer_id = e.customer_id
WHERE e.transaction_datetime BETWEEN '2003-01-06' AND '2017-12-31'
GROUP BY id.customer_id,
DATEADD(DAY, DATEDIFF(DAY, 0, e.transaction_datetime) / 30 * 30, 0),
DATEADD(DAY, DATEDIFF(DAY, -30, e.transaction_datetime) / 30 * 30, 0)
ORDER BY [window_start_dt], [window_end_dt]
The result looks like this:
customer_id,window_start_dt,window_end_dt
1,2003-01-06,2003-02-05
However, this is not the 30-day window I want:
customer_id,window_start_dt,window_end_dt
1,2003-01-06,2003-02-05
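So my problem is that window_end_dt is off. Currently I pass -30 to DATEDIFF, which feels like a hack, so a better way to compute window_end_dt is welcome.
Edit
Here is a small sample dataset covering at least the first month: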
customer_id, transaction_datetime
1, 2013-02-04
1, 2013-01-21
1, 2013-01-22
1, 2013-01-27
2, 2013-02-02
2, 2013-01-08
2, 2013-01-19
2, 2013-01-21
3, 2013-02-03
3, 2013-01-15
3, 2013-01-19
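Also, if possible, I'd like to be able to choose an arbitrary start date for the windows (not aligned to calendar months). For example, ideally I'd like the windows to start on 2003-01-06.
Edit
I've made changes to reflect the desired 2003-01-06 start date for the 30-day windows and to avoid confusion. I am computing other columns within these windows, but I removed them to keep things simple and focus on the date logic of the grouping.
Answer 0: (score: 1)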
select customer_id,
DATEADD(DAY,(DATEDIFF(DAY,'2013-01-06',transaction_datetime)/30)*30,'2013-01-06') window_start,
DATEADD(DAY,(DATEDIFF(DAY,'2013-01-06',transaction_datetime)/30)*30+29,'2013-01-06') window_end
from event
where transaction_datetime>='2013-01-06'
group by customer_id,DATEDIFF(DAY,'2013-01-06',transaction_datetime)/30
Tested on sqlfiddle.com
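To make the arithmetic in the query above easier to follow, here is a minimal sketch that applies the same expressions to three hand-picked dates. The `@anchor` variable, the `window_no` column, and the inline `VALUES` rows are only illustrative and are not part of the original query; it assumes SQL Server 2008 or later.

```sql
-- Illustrative anchor date; it does not have to align to a calendar month.
DECLARE @anchor date = '2013-01-06';

SELECT d.transaction_datetime,
       -- number of whole 30-day blocks since the anchor (integer division)
       DATEDIFF(DAY, @anchor, d.transaction_datetime) / 30 AS window_no,
       -- first day of the block the date falls into
       DATEADD(DAY, DATEDIFF(DAY, @anchor, d.transaction_datetime) / 30 * 30, @anchor) AS window_start,
       -- last day of the block: start + 29 gives an inclusive 30-day span
       DATEADD(DAY, DATEDIFF(DAY, @anchor, d.transaction_datetime) / 30 * 30 + 29, @anchor) AS window_end
FROM (VALUES (CAST('2013-01-06' AS date)),   -- day 0  -> window 0: 2013-01-06 to 2013-02-04
             (CAST('2013-02-04' AS date)),   -- day 29 -> window 0: 2013-01-06 to 2013-02-04
             (CAST('2013-02-05' AS date))    -- day 30 -> window 1: 2013-02-05 to 2013-03-06
     ) AS d(transaction_datetime);
```

Adding 29 rather than 30 to the start is what keeps each window at exactly 30 days when both ends are inclusive. Integer division in T-SQL truncates toward zero, so a date up to 29 days before the anchor would also land in window 0; the `where transaction_datetime>='2013-01-06'` filter in the query is what prevents that.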
Answer 1: (score: 0)
I'm not sure what the end goal is, but you can build your running date windows with a CTE and then join to it...
declare @table table (customer_id int, transaction_datetime datetime)
insert into @table
values
(1, '2013-01-06'),
(1, '2013-01-21'),
(1, '2013-01-22'),
(1, '2013-01-27'),
(2, '2013-01-02'),
(2, '2013-01-08'),
(2, '2013-01-19'),
(2, '2013-01-21'),
(3, '2013-01-27'),
(3, '2013-01-15'),
(3, '2013-01-19'),
(3, '2013-02-19'), --I added these last 3 rows to show that an id can fall in multiple windows
(3, '2013-03-14'),
(3, '2013-01-29')
declare @startDate date = '20130101'
declare @endDate date = (select max(transaction_datetime ) from @table)
;with dates as(
select TheDate = @startDate
union all
select TheDate = dateadd(day,30,TheDate)
from dates
where TheDate <= @endDate
)
select distinct
customer_id
,StartWindow = TheDate
,EndWindow = dateadd(day,29,TheDate)
from @table
inner join dates on
transaction_datetime between TheDate and dateadd(day,29,TheDate)
option (maxrecursion 0)
So for your data... maybe something like this...
+-------------+-------------+------------+
| customer_id | StartWindow | EndWindow |
+-------------+-------------+------------+
| 1 | 2013-01-01 | 2013-01-30 |
| 2 | 2013-01-01 | 2013-01-30 |
| 3 | 2013-01-01 | 2013-01-30 |
| 3 | 2013-01-31 | 2013-03-01 |
| 3 | 2013-03-02 | 2013-03-31 |
+-------------+-------------+------------+
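Since the question mentions computing other columns per window, here is a rough sketch of how the same CTE join could feed an aggregate instead of `select distinct`. Everything specific in it is an assumption for illustration: the `@event` table variable and its four rows, the `transaction_count` column, and the `'20130106'` start date. It also uses a half-open range rather than `between`, so a datetime with a time component on the last day of a window is still matched.

```sql
-- Sketch only: @event is a hypothetical stand-in for the real event table.
DECLARE @event table (customer_id int, transaction_datetime datetime);
INSERT INTO @event
VALUES (1, '2013-01-21'), (1, '2013-01-27'), (1, '2013-02-10'), (2, '2013-01-08');

DECLARE @startDate date = '20130106';  -- arbitrary window anchor
DECLARE @endDate date = (SELECT MAX(transaction_datetime) FROM @event);

WITH dates AS (
    SELECT TheDate = @startDate
    UNION ALL
    SELECT DATEADD(DAY, 30, TheDate)
    FROM dates
    WHERE DATEADD(DAY, 30, TheDate) <= @endDate  -- stop once a window would start after the last event
)
SELECT e.customer_id,
       StartWindow = d.TheDate,
       EndWindow = DATEADD(DAY, 29, d.TheDate),
       transaction_count = COUNT(*)              -- example per-window aggregate
FROM @event AS e
INNER JOIN dates AS d
    ON e.transaction_datetime >= d.TheDate
   AND e.transaction_datetime < DATEADD(DAY, 30, d.TheDate)
GROUP BY e.customer_id, d.TheDate
ORDER BY e.customer_id, d.TheDate
OPTION (MAXRECURSION 0);
```

With these four rows, customer 1 should get two windows (two transactions in the one starting 2013-01-06, one in the one starting 2013-02-05) and customer 2 a single window with one transaction.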