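Desired output (the last column is what I am trying to get):
| customer_id | purchase_date | Difference between purchases (in days) |
|-------------|---------------|----------------------------------------|
| 1 | 23/04/2017 | 0 (first row) |
| 1 | 24/04/2017 | 1 |
| 1 | 01/01/2018 | 252 |
| 2 | 03/05/2017 | 0 (this is a new customer) |
| 2 | 10/05/2017 | 7 |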
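I want to calculate the time difference between consecutive purchases by the same customer. I tried using the LAG and LEAD functions, but I get a syntax error that I don't understand.
This is what I have so far: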
SELECT customer_id, purchase_date
case
when lag(purchase_date,1,0) over (partition by customer_id order by
purchase_date) = 0 then 0
ELSE purchase_date -lag(purchase_date,1,0) over(partition by customer_id order
by purchase_date)
end
FROM
Table1
It gives me a syntax error right after the first "over", and I don't understand why.
Answer 0 (score: 1)
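MySQL versions before 8.0 do not support window functions such as LAG and LEAD.
MySQL 8.0 adds support for window functions, but at the time of writing it is only a release candidate and not production-ready.
On MySQL versions before 8.0 you can emulate LAG with MySQL user variables or with a correlated subquery.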
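The correlated-subquery option can be sketched as follows. This is a minimal illustration rather than a tuned query: it assumes the Table1 layout from the question, with purchase_date stored as a 'dd/mm/yyyy' string, and the aliases t and p are chosen only for this example.

```sql
-- Sketch: emulate LAG with a correlated subquery.
-- Assumes Table1(customer_id, purchase_date) with purchase_date as 'dd/mm/yyyy' text.
SELECT
    t.customer_id
  , t.purchase_date
  , COALESCE(
        DATEDIFF(
            STR_TO_DATE(t.purchase_date, '%d/%m/%Y')
          , (
                -- most recent earlier purchase by the same customer;
                -- NULL when the current row is the customer's first purchase
                SELECT MAX(STR_TO_DATE(p.purchase_date, '%d/%m/%Y'))
                FROM Table1 AS p
                WHERE p.customer_id = t.customer_id
                  AND STR_TO_DATE(p.purchase_date, '%d/%m/%Y')
                      < STR_TO_DATE(t.purchase_date, '%d/%m/%Y')
            )
        )
      , 0  -- first purchase of a customer: report 0 instead of NULL
    ) AS diff
FROM Table1 AS t
ORDER BY
    t.customer_id
  , STR_TO_DATE(t.purchase_date, '%d/%m/%Y');
```

The subquery runs once per row, so on large tables this is likely to be slower than the user-variable approach. If a customer has two purchases on the same date, both rows are measured against the previous distinct date.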
Create table/insert data
CREATE TABLE Table1
(`customer_id` int, `purchase_date` varchar(10))
;
INSERT INTO Table1
(`customer_id`, `purchase_date`)
VALUES
(1, '23/04/2017'),
(1, '24/04/2017'),
(1, '01/01/2018'),
(2, '03/05/2017'),
(2, '10/05/2017')
;
The trick with MySQL user variables is to initialize them correctly.
Query
SELECT
*
, (@customer_id := Table1.customer_id) AS init_customer_id_param
, (@purchase_date := Table1.purchase_date) AS init_purchase_date_param
FROM
Table1
CROSS JOIN (
SELECT
@customer_id := NULL
, @purchase_date := NULL
)
AS init_user_params
Result
| customer_id | purchase_date | @customer_id := NULL | @purchase_date := NULL | init_customer_id_param | init_purchase_date_param |
|-------------|---------------|----------------------|------------------------|------------------------|--------------------------|
| 1 | 23/04/2017 | (null) | (null) | 1 | 23/04/2017 |
| 1 | 24/04/2017 | (null) | (null) | 1 | 24/04/2017 |
| 1 | 01/01/2018 | (null) | (null) | 1 | 01/01/2018 |
| 2 | 03/05/2017 | (null) | (null) | 2 | 03/05/2017 |
| 2 | 10/05/2017 | (null) | (null) | 2 | 10/05/2017 |
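Now you can add the calculation.
Keep in mind that the calculation has to come before the expressions that assign the user variables for the current row.
That way the user variables still hold the values from the previous row when the difference is computed.
Query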
SELECT
*
, (
CASE
WHEN (@customer_id = Table1.customer_id)
THEN DATEDIFF(STR_TO_DATE(purchase_date, "%d/%m/%Y"), STR_TO_DATE(@purchase_date, "%d/%m/%Y"))
END
) AS diff
, (@customer_id := Table1.customer_id) AS init_customer_id_param
, (@purchase_date := Table1.purchase_date) AS init_purchase_date_param
FROM
Table1
CROSS JOIN (
SELECT
@customer_id := NULL
, @purchase_date := NULL
)
AS init_user_params
ORDER BY
Table1.customer_id ASC
, STR_TO_DATE(purchase_date, "%d/%m/%Y") ASC
Note
I used the STR_TO_DATE function to convert the varchar-based date strings into real dates.
If your date column already has a date data type, you can drop the function and use THEN DATEDIFF(purchase_date, @purchase_date) instead.
Result
| customer_id | purchase_date | @customer_id := NULL | @purchase_date := NULL | diff | init_customer_id_param | init_purchase_date_param |
|-------------|---------------|----------------------|------------------------|--------|------------------------|--------------------------|
| 1 | 23/04/2017 | (null) | (null) | (null) | 1 | 23/04/2017 |
| 1 | 24/04/2017 | (null) | (null) | 1 | 1 | 24/04/2017 |
| 1 | 01/01/2018 | (null) | (null) | 252 | 1 | 01/01/2018 |
| 2 | 03/05/2017 | (null) | (null) | (null) | 2 | 03/05/2017 |
| 2 | 10/05/2017 | (null) | (null) | 7 | 2 | 10/05/2017 |
Now just select the right columns and replace the NULL diff of each customer's first purchase with 0.
Query
SELECT
Table1_user_params.customer_id
, Table1_user_params.purchase_date
, (
CASE
WHEN Table1_user_params.diff IS NULL
THEN 0
ELSE Table1_user_params.diff
END
)
AS diff
FROM (
SELECT
*
, (
CASE
WHEN (@customer_id = Table1.customer_id)
THEN DATEDIFF(STR_TO_DATE(purchase_date, "%d/%m/%Y"), STR_TO_DATE(@purchase_date, "%d/%m/%Y"))
END
) AS diff
, (@customer_id := Table1.customer_id) AS init_customer_id_param
, (@purchase_date := Table1.purchase_date) AS init_purchase_date_param
FROM
Table1
CROSS JOIN (
SELECT
@customer_id := NULL
, @purchase_date := NULL
)
AS init_user_params
ORDER BY
Table1.customer_id ASC
, STR_TO_DATE(purchase_date, "%d/%m/%Y") ASC
)
AS Table1_user_params
Result
| customer_id | purchase_date | diff |
|-------------|---------------|------|
| 1 | 23/04/2017 | 0 |
| 1 | 24/04/2017 | 1 |
| 1 | 01/01/2018 | 252 |
| 2 | 03/05/2017 | 0 |
| 2 | 10/05/2017 | 7 |
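For comparison, on a MySQL 8.0+ server with window-function support, the same result can be written directly with LAG. The following is a sketch under the same assumptions as above (Table1 with purchase_date stored as a 'dd/mm/yyyy' string). It also avoids relying on the evaluation order of user variables, which the MySQL documentation describes as undefined.

```sql
-- Sketch for MySQL 8.0+: days since the same customer's previous purchase.
SELECT
    customer_id
  , purchase_date
  , COALESCE(
        DATEDIFF(
            STR_TO_DATE(purchase_date, '%d/%m/%Y')
          , LAG(STR_TO_DATE(purchase_date, '%d/%m/%Y')) OVER (
                PARTITION BY customer_id
                ORDER BY STR_TO_DATE(purchase_date, '%d/%m/%Y')
            )
        )
      , 0  -- LAG yields NULL on a customer's first row; report 0 instead
    ) AS diff
FROM Table1
ORDER BY
    customer_id
  , STR_TO_DATE(purchase_date, '%d/%m/%Y');
```

With the sample data above, this should return the same five rows as the final result table.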