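My data looks like the example below and is very large, so ideally I need efficient code. For customers who had a charger repair, I want to find only the transactions that happened after that charger repair.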
TRANSACTION_ID  REPAIR_DATE  CUSTOMER_ID  COMPONENT  LABOR_CODE_DESC  Size  ....
28289           6/25/2015    AH123        LAPTOP     CHARGER REPAIR   big
28235           6/29/2015    AH123        LAPTOP     CHIP REPLACE     small
258978          6/27/2013    HW687        PHONE      TOUCH SCREEN
28223           6/2/2014     AH123        LAPTOP     BATTERY REPAIR
215678          7/28/2014    HW687        PHONE      SIM REPAIR
527808          7/30/2016    HW687        LAPTOP     BATTERY REPAIR
567976          7/28/2014    HW687        LAPTOP     CHARGER REPAIR   big
7678698         8/68/2015    AH123        LAPTOP     BATTERY REPAIR
9987908         5/7/2006     TU890        PHONE      SIM REPAIR
.....
OUTPUT
TRANSACTION_ID  REPAIR_DATE  CUSTOMER_ID  COMPONENT  LABOR_CODE_DESC  ....
28235           6/29/2015    AH123        LAPTOP     CHIP REPLACE
7678698         8/68/2015    AH123        LAPTOP     BATTERY REPAIR
527808          7/30/2016    HW687        LAPTOP     BATTERY REPAIR
.....
Not needed:
215678 7/28/2014 HW687 PHONE SIM REPAIR
because it falls on the same date as the charger repair. I tried the following code:
SELECT *
FROM tab
QUALIFY
Max(CASE WHEN LABOR_CODE_DESC = 'CHARGER REPAIR' THEN 1 ELSE 0 END)
Over (PARTITION BY CUSTOMER_ID
ORDER BY REPAIR_DATE
ROWS BETWEEN Unbounded Preceding AND 1 Preceding) >= 1
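With this query I miss some transactions that happened on the same day as the charger repair, probably because the window is ordered by REPAIR_DATE, so rows that share a date have no guaranteed order. I might as well ignore every transaction on the same date as the charger repair to avoid the problem. I would also like to restrict by Size. Where can I add that condition? Please suggest the most efficient approach, because my table is very large.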
Answer 0 (score: 2)
Does this work in Teradata?
SELECT *
FROM tab
QUALIFY REPAIR_DATE > Max(CASE WHEN LABOR_CODE_DESC = 'CHARGER REPAIR' THEN REPAIR_DATE
END) Over (PARTITION BY CUSTOMER_ID
ORDER BY REPAIR_DATE
ROWS BETWEEN Unbounded Preceding AND 1 Preceding
);
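This keeps only rows whose REPAIR_DATE is strictly later than a preceding charger repair for the same customer, so transactions on the same date as the charger repair are dropped.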
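The question about Size is not covered above, so here is one possible sketch, not a tested solution. It assumes that "restrict by size" means only charger repairs with Size = 'big' should count as the starting point, that REPAIR_DATE is a DATE column rather than a string, and that the table is called tab as in the question. Because the comparison is strictly greater than, comparing against the earliest qualifying charger repair per customer should give the same rows as the ordered window, and it may let the optimizer skip the ordered window frame. Check the plan with EXPLAIN on the real table before relying on that.

```sql
-- Sketch only: the Size condition and its value 'big' are assumptions.
-- "Size" is quoted as a precaution in case it clashes with a keyword.
SELECT *
FROM tab
QUALIFY REPAIR_DATE >
        MIN(CASE WHEN LABOR_CODE_DESC = 'CHARGER REPAIR'
                  AND "Size" = 'big'   -- assumed size filter; remove if not wanted
                 THEN REPAIR_DATE      -- date of a qualifying charger repair
            END) OVER (PARTITION BY CUSTOMER_ID);  -- earliest one per customer
```

Customers with no qualifying charger repair get NULL from the window, so the comparison is not true and their rows are filtered out. If the Size restriction is meant for the returned rows instead, add it to QUALIFY with AND rather than to WHERE, since WHERE is applied before the window function and would remove the charger repair rows it needs to see.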