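I inherited an old SQL script that I want to optimize, but after many attempts I have to admit that everything I try just produces a huge SQL statement full of repeated blocks. I would like to know whether anyone can suggest better code for the pattern below. I would rather not use temporary tables (WITH). For simplicity I have shown only 3 levels (tables TMP_C, TMP_D and TMP_E), but the original SQL has 8 levels.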
WITH
TMP_A AS (
SELECT
ID,
Field_X
FROM A),
TMP_B AS(
SELECT DISTINCT
B.ID,
Field_Y,
CASE
WHEN Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM B
INNER JOIN TMP_A
ON TMP_A.ID=B.ID),
TMP_C AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_1'),
TMP_D AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_2' AND ID NOT IN (SELECT ID FROM TMP_C)),
TMP_E AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_3'
AND ID NOT IN (SELECT ID FROM TMP_C)
AND ID NOT IN (SELECT ID FROM TMP_D))
SELECT * FROM TMP_C
UNION
SELECT * FROM TMP_D
UNION
SELECT * FROM TMP_E
Thank you very much for your help.
Answer 0 (score: 3)
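First, SELECT DISTINCT already prevents duplicates in the result set, so you are over-processing. Adding the "WITH" definitions and nesting their use only makes the query harder to follow. The data ultimately all comes from table "B", restricted to rows whose key also has a match in "A". Let's start from there. Since your result set uses nothing from (B) Field_Y or (A) Field_X, don't add them to the mix.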
SELECT DISTINCT
B.ID,
CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN B.Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2', 'TEST_3', 'TEST_4', 'TEST_5', 'TEST_6' )
The WHERE clause includes only the Field_Z values that qualify for the categories you want, and each category still gets its results.
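Now, if you really do need other values from "Field_Y" or "Field_X", that would call for a different query. However, your Tmp_C, Tmp_D and Tmp_E only ask for the ID and CATEG columns anyway.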
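One thing worth checking: the NOT IN chain in the question keeps only one category per ID (CATEG_1 wins over CATEG_2, which wins over CATEG_3), whereas a plain SELECT DISTINCT can return the same ID under several categories. If that priority matters, a minimal sketch is to aggregate per ID. It assumes the category labels sort in priority order as strings (true for 'CATEG_1' through 'CATEG_8', not for arbitrary names) and that ID is never NULL; the question confirms neither.

```sql
-- Sketch only: one row per ID, carrying its highest-priority category.
-- Assumption: 'CATEG_1' < 'CATEG_2' < 'CATEG_3' in string order, so MIN()
-- picks the category that the NOT IN chain would have kept.
SELECT
    B.ID,
    MIN(CASE
            WHEN B.Field_Z IN ('TEST_1', 'TEST_2') THEN 'CATEG_1'
            WHEN B.Field_Z IN ('TEST_3', 'TEST_4') THEN 'CATEG_2'
            WHEN B.Field_Z IN ('TEST_5', 'TEST_6') THEN 'CATEG_3'
        END) AS CATEG
FROM
    B JOIN A ON B.ID = A.ID
WHERE
    -- Rows that would fall into CATEG_4 are excluded, as in the question.
    B.Field_Z IN ('TEST_1', 'TEST_2', 'TEST_3', 'TEST_4', 'TEST_5', 'TEST_6')
GROUP BY
    B.ID
```

Extending this to 8 levels would only add WHEN branches and values to the IN list. If the real category names do not sort in priority order, the CASE could return a numeric rank instead and be mapped back to a label in an outer query.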
Answer 1 (score: 0)
This may perform better:
SELECT DISTINCT B.ID, 'CATEG_1'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2')
UNION
SELECT DISTINCT B.ID, 'CATEG_2'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_3', 'TEST_4')
...