I have a table that shows customer_id, product_id, browse_date, purchase_date, and the difference between the browse and purchase dates. It looks like this:
id pID b_Date p_Date
1 001 7/20/2014 7/20/2014
1 001 7/20/2014 7/20/2014
1 002 7/20/2014 7/20/2014
2 001 7/20/2014 7/20/2014
2 001 7/20/2014 8/01/2014
2 002 7/25/2014 8/01/2014
2 002 7/26/2014 8/01/2014
2 002 7/28/2014 8/01/2014
2 002 7/28/2014 8/01/2014
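What is the most efficient way to append, for each customer, the date of the most recent earlier purchase (the one before the latest purchase)? The result should look like this: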
id pID b_Date p_Date latest_purchase_date
1 001 7/20/2014 7/20/2014 'N/A'
1 001 7/20/2014 7/20/2014 'N/A'
1 002 7/20/2014 7/20/2014 'N/A'
2 001 7/20/2014 7/20/2014 'N/A'
2 001 7/20/2014 8/01/2014 7/20/2014
2 002 7/25/2014 8/01/2014 7/20/2014
2 002 7/26/2014 8/01/2014 7/20/2014
2 002 7/28/2014 8/01/2014 7/20/2014
2 002 7/28/2014 8/01/2014 7/20/2014
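I am using Teradata 13.1.
Answer 0 (score: 1)
There is no LAG in Teradata, but it is easy to rewrite with a windowed aggregate.
Because multiple rows share the same p_date, you need to track each point where it changes.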
SELECT id, pid, b_date, p_date
  ,MAX(last_dt) -- fill the NULLs with the last date
   OVER (PARTITION BY id ORDER BY p_date, last_dt DESC
         ROWS UNBOUNDED PRECEDING) AS latest_purchase_date
FROM
 (
   SELECT id, pid, b_date, p_date,
      -- MIN over a one-row frame reads the previous row's p_date (emulates LAG)
      NULLIF(MIN(p_date) -- return the date only when there's a change, otherwise NULL
             OVER (PARTITION BY id ORDER BY p_date
                   ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING)
            , p_date) AS last_dt
   FROM vt
 ) AS dt
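To check the logic against the sample data from the question, here is a minimal sketch that runs the same query in an in-memory SQLite database from Python. SQLite is only a stand-in for Teradata here (an assumption of this sketch, chosen because it accepts the same window-frame syntax and also sorts NULLs last under DESC); the dates are rewritten as ISO strings so that text ordering matches date ordering, and the trailing ORDER BY is added only to make the printed output stable.

```python
import sqlite3

# In-memory SQLite database standing in for Teradata (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vt (id INTEGER, pid TEXT, b_date TEXT, p_date TEXT)")

# Sample rows from the question; dates are stored as ISO strings so that
# string comparison orders them chronologically.
rows = [
    (1, "001", "2014-07-20", "2014-07-20"),
    (1, "001", "2014-07-20", "2014-07-20"),
    (1, "002", "2014-07-20", "2014-07-20"),
    (2, "001", "2014-07-20", "2014-07-20"),
    (2, "001", "2014-07-20", "2014-08-01"),
    (2, "002", "2014-07-25", "2014-08-01"),
    (2, "002", "2014-07-26", "2014-08-01"),
    (2, "002", "2014-07-28", "2014-08-01"),
    (2, "002", "2014-07-28", "2014-08-01"),
]
conn.executemany("INSERT INTO vt VALUES (?, ?, ?, ?)", rows)

query = """
SELECT id, pid, b_date, p_date,
       -- running MAX carries the last non-NULL date forward
       MAX(last_dt) OVER (PARTITION BY id
                          ORDER BY p_date, last_dt DESC
                          ROWS UNBOUNDED PRECEDING) AS latest_purchase_date
FROM (
    SELECT id, pid, b_date, p_date,
           -- previous row's p_date, kept only where p_date changes
           NULLIF(MIN(p_date) OVER (PARTITION BY id
                                    ORDER BY p_date
                                    ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),
                  p_date) AS last_dt
    FROM vt
) AS dt
ORDER BY id, p_date, b_date, pid
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Rows with no earlier purchase come back as NULL (None in Python) rather than the 'N/A' shown in the question; turning that into a literal 'N/A' would require converting the date column to a character type first.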