Question

I'm having a hard time figuring out how to write a recursive CTE for my use case.

**TEMP** TABLE (CAN BE MODIFIED): Employee_Money_Accounts 
-- Primary key on Employer_ID, Account_ID

id	Employer_ID	Account_ID	get_date (int)
1	1	5	20210105
2	2	8	20210104
3	1	1145	20210105

TABLE: Employee_Money_Accounts_Past 
-- Primary key: as_of_dte ASC, employer_ID ASC, account_ID ASC

Employer_ID	account_ID	as_of_dte (int)	Money
1	5	20201215	5.00
1	5	20201201	8.00
2	8	20201201	15.00

I have millions of records in Employee_Money_Accounts_Past and thousands of records in Employee_Money_Accounts.

For each account_ID, I need to pull the MAX as_of_dte that does not exceed a given upper bound (get_date).

SELECT EP.Employer_ID, EP.Account_ID, MAX(as_of_dte) 
FROM   Employee_Money_Accounts_Past EP
INNER JOIN Employee_Money_Accounts EA ON EP.Employer_ID = EA.Employer_ID 
                                     AND EP.Account_ID  = EA.Account_ID
WHERE EP.as_of_dte <= EA.get_date
GROUP BY EP.Employer_ID, EP.Account_ID

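The query above is too slow, so I wanted to write a recursive CTE (not a WHILE loop either) to handle this.

Here is what I have so far, and it is also super slow! Basically, I want to use a recursive CTE to pass one Employer_ID and Account_ID at a time to the main query, because that is fast.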

;WITH EmpAccts (id, Employer_ID, Account_ID) AS
(
    -- Anchor member: start from the first row (assumes ids are contiguous and start at 1)
    SELECT id, Employer_ID, Account_ID
    FROM #Employee_Money_Accounts
    WHERE id = 1
    UNION ALL
    -- Recursive member: step to the row with the next id
    SELECT EA.id, EA.Employer_ID, EA.Account_ID
    FROM EmpAccts E
    INNER JOIN #Employee_Money_Accounts EA ON EA.id = E.id + 1
)
SELECT EA2.Employer_ID, EA2.Account_ID, MAX(EP.as_of_dte)
FROM EmpAccts EA2
INNER JOIN Employee_Money_Accounts_Past EP ON EA2.Employer_ID = EP.Employer_ID 
                                          AND EA2.Account_ID = EP.Account_ID
INNER JOIN #Employee_Money_Accounts EMP ON EP.Employer_ID = EMP.Employer_ID
                                       AND EP.Account_ID = EMP.Account_ID
-- The upper bound lives in get_date; the temp table has no as_of_dte column
WHERE EP.as_of_dte <= EMP.get_date
GROUP BY EA2.Employer_ID, EA2.Account_ID
-- Thousands of rows exceed the default recursion limit of 100
OPTION (MAXRECURSION 0)

Answer 1

A few tips:

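The table Employee_Money_Accounts has a column named id, which appears to be an identity column. If that is the case, why not make it the primary key column (and possibly add an alternate key constraint on Employer_ID, Account_ID)? You could then add an FK reference to the id column in Employee_Money_Accounts_Past. I'm not sure, but this might help speed things up.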
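A minimal sketch of the key layout suggested above, assuming the temp table can be recreated and that every column is an INT (the question only states that get_date is an int; the other types are guesses). The constraints are left unnamed because named constraints on temp tables can collide across sessions. The FK part is not shown: as far as I know, SQL Server does not allow a foreign key that references a temp table, so that piece would only apply if Employee_Money_Accounts were a permanent table.

```sql
-- Sketch only: column types other than get_date are assumptions.
CREATE TABLE #Employee_Money_Accounts
(
    id          INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,  -- surrogate key
    Employer_ID INT NOT NULL,
    Account_ID  INT NOT NULL,
    get_date    INT NOT NULL,                             -- upper bound, e.g. 20210105
    UNIQUE (Employer_ID, Account_ID)                      -- alternate key
);
```

Whether this actually speeds up the join depends on the execution plan, so it is worth comparing plans before and after.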

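Also, when you are working with millions of rows and a problem like this one, you could consider adding a columnstore index on Employee_Money_Accounts_Past. Columnstore indexes can improve query performance by up to 10x. They are common in data warehouses and for running analytics directly on OLTP systems.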
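For illustration, a nonclustered columnstore index covering the three columns the query touches might look like the following. The index name is made up, and this assumes a SQL Server version that supports nonclustered columnstore indexes (2012 or later; they only became updatable in 2016).

```sql
-- Sketch only: NCCI_Employee_Money_Accounts_Past is a hypothetical index name.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Employee_Money_Accounts_Past
    ON Employee_Money_Accounts_Past (Employer_ID, Account_ID, as_of_dte);
```

The existing rowstore primary key stays in place, so the optimizer can still choose between the two access paths for the MAX(as_of_dte) aggregation.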
