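Used SQL Fiddle: SQLfiddle
Update: I have an Oracle query in which I need to reduce the number of left outer joins so that it executes efficiently. The current query runs for more than 2 hours, and I want to lower its complexity by reducing the number of join operations.
Without the joins, the query runs in 15 minutes, so I would like to rewrite the logic. Is there an efficient way to do this?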
WITH myquery AS
(
SELECT *
FROM TEST_FILE1
)
SELECT
A.Col3, A.Col1, A.Col2, A.Col4, A.Col5
-- D.CB,
-- NVL(D.CD, 0), NVL(D.CE, 0), NVL(D.EF, 0),
,CASE WHEN V1.Col1 IS NULL THEN 0 ELSE 1 END AS QQ1
,CASE WHEN V2.Col3 IS NULL THEN 0 ELSE 1 END AS QQ2
,CASE WHEN V3.Col1 IS NULL THEN 0 ELSE 1 END AS QQ3
,CASE WHEN V4.Col3 IS NULL THEN 0 ELSE 1 END AS QQ4
, case when V5.Col1 is NULL then 0 else 1 end as QQ5
, case when V6.Col3 is NULL then 0 else 1 end as QQ6
, case when V7.Col1 is NULL then 0 else 1 end as QQ7
, case when V8.Col3 is NULL then 0 else 1 end as QQ8
FROM (
SELECT Col3, Col1, Col2, Col4, Col5
FROM (
SELECT distinct Col3
FROM myquery
) A1
CROSS JOIN (
SELECT distinct Col1
FROM myquery
) A2
CROSS JOIN (
SELECT distinct Col2
FROM myquery
) A3
CROSS JOIN (
SELECT distinct Col4
FROM myquery
) A4
CROSS JOIN (
SELECT distinct Col5
FROM myquery
) A5
WHERE Col3 = 42
) A
LEFT JOIN myquery D on NVL(D.Col3, '-') = NVL(A.Col3, '-')
AND NVL(D.Col1, '-') = NVL(A.Col1, '-')
AND NVL(D.Col2, '-') = NVL(A.Col2, '-')
AND NVL(D.Col4, '-') = NVL(A.Col4, '-')
AND NVL(D.Col5, '-') = NVL(A.Col5, '-')
LEFT JOIN (
SELECT distinct Col1, Col3, Col5
FROM myquery
) V1 on V1.Col1 = A.Col1 AND V1.Col3 = A.Col3 AND V1.Col5 = A.Col5
LEFT JOIN (
SELECT distinct Col3, Col5, Col2
FROM myquery
) V2 on V2.Col3 = A.Col3 AND V2.Col5 = A.Col5 AND V2.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col3, Col5, Col1, Col2
FROM myquery
) V3 on V3.Col3 = A.Col3 AND V3.Col5 = A.Col5 AND V3.Col1 = A.Col1 AND V3.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col3, Col5, Col2
FROM myquery
WHERE Col1 in ('Bert','Myra')
) V4 on V4.Col3 = A.Col3 AND V4.Col5 = A.Col5 AND V4.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col1, Col3
FROM myquery
) V5 on V5.Col1 = A.Col1 AND V5.Col3 = A.Col3
LEFT JOIN (
SELECT distinct Col3, Col2
FROM myquery
) V6 on V6.Col3 = A.Col3 AND V6.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col3, Col1, Col2
FROM myquery
) V7 on V7.Col3 = A.Col3 AND V7.Col1 = A.Col1 AND V7.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col3, Col2
FROM myquery
WHERE Col1 in ('Bert','Myra')
) V8 on V8.Col3 = A.Col3 AND V8.Col2 = A.Col2
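So far I have been thinking about using analytic window functions, but I have not been able to get the desired output. Any leads would be highly appreciated.
This is the input data in my test_file table: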
+------+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 | COL5 |
+------+------+------+------+------+
| Bert | "M"  | 42   | 68   | 166  |
| Carl | "M"  | 32   | 70   | 155  |
| Dave | "M"  | 39   | 72   | 167  |
| Elly | "F"  | 30   | 66   | 124  |
| Fran | "F"  | 33   | 66   | 115  |
| Hank | "M"  | 30   | 71   | 158  |
| Jake | "M"  | 32   | 69   | 143  |
| Luke | "M"  | 34   | 72   | 163  |
| Neil | "M"  | 36   | 75   | 160  |
| Page | "F"  | 31   | 67   | 135  |
| Alex | "M"  | 41   | 74   | 170  |
| Gwen | "F"  | 26   | 64   | 121  |
| Ivan | "M"  | 53   | 72   | 175  |
| Kate | "F"  | 47   | 69   | 139  |
| Myra | "F"  | 23   | 62   | 98   |
| Omar | "M"  | 38   | 70   | 145  |
| Quin | "M"  | 29   | 71   | 176  |
| Ruth | "F"  | 28   | 65   | 131  |
+------+------+------+------+------+
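From this table, I want to build every possible combination by cross joining the distinct values of each column. With my filter Col3 = 42, this produces 7776 records, because I only want all possible combinations for that one Col3 value.
Against these combinations, I want to use a number of left outer joins to check, for each set of columns, whether the combination comes back NULL (that is, whether it is absent from the source data).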
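As a quick sanity check on the 7776 figure, the size of the cross join can be computed from the distinct counts without building it. This is only a sketch against the sample data above: it assumes the table is TEST_FILE1 as in the query and that none of the four columns contain NULLs (COUNT(DISTINCT ...) ignores NULLs, whereas SELECT DISTINCT keeps a NULL as its own value). For the sample rows the counts work out to 18 × 2 × 12 × 18 = 7776, with Col3 fixed at 42.

```sql
-- Sketch: number of rows the cross join of distinct values will produce
-- when Col3 is pinned to a single value (Col3 = 42).
-- Assumes no NULLs in Col1, Col2, Col4, Col5.
SELECT COUNT(DISTINCT Col1) AS n_col1,
       COUNT(DISTINCT Col2) AS n_col2,
       COUNT(DISTINCT Col4) AS n_col4,
       COUNT(DISTINCT Col5) AS n_col5,
       COUNT(DISTINCT Col1) * COUNT(DISTINCT Col2)
         * COUNT(DISTINCT Col4) * COUNT(DISTINCT Col5) AS n_combinations
FROM TEST_FILE1;
```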
Output (partial):
+------+------+------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+
| COL3 | COL1 | COL2 | COL4 | COL5 | QQ1 | QQ2 | QQ3 | QQ4 | QQ5 | QQ6 | QQ7 | QQ8 |
+------+------+------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+
| 42   | Page | "F"  | 68   | 176  | 0   | 0   | 0   | 0   | 0   | 0   | 0   | 0   |
| 42   | Alex | "F"  | 62   | 143  | 0   | 0   | 0   | 0   | 0   | 0   | 0   | 0   |
| 42   | Fran | "M"  | 66   | 175  | 0   | 0   | 0   | 0   | 0   | 1   | 0   | 1   |
| 42   | Omar | "F"  | 70   | 176  | 0   | 0   | 0   | 0   | 0   | 0   | 0   | 0   |
| 42   | Elly | "M"  | 72   | 124  | 0   | 0   | 0   | 0   | 0   | 1   | 0   | 1   |
| 42   | Quin | "M"  | 64   | 160  | 0   | 0   | 0   | 0   | 0   | 1   | 0   | 1   |
| 42   | Omar | "M"  | 64   | 158  | 0   | 0   | 0   | 0   | 0   | 1   | 0   | 1   |
| 42   | Kate | "F"  | 62   | 176  | 0   | 0   | 0   | 0   | 0   | 0   | 0   | 0   |
| 42   | Neil | "F"  | 69   | 145  | 0   | 0   | 0   | 0   | 0   | 0   | 0   | 0   |
| 42   | Dave | "F"  | 62   | 163  | 0   | 0   | 0   | 0   | 0   | 0   | 0   | 0   |
| 42   | Ruth | "M"  | 70   | 115  | 0   | 0   | 0   | 0   | 0   | 1   | 0   | 1   |
| 42   | Bert | "M"  | 65   | 121  | 0   | 0   | 0   | 0   | 1   | 1   | 1   | 1   |
| 42   | Bert | "M"  | 72   | 145  | 0   | 0   | 0   | 0   | 1   | 1   | 1   | 1   |
| 42   | Omar | "M"  | 62   | 158  | 0   | 0   | 0   | 0   | 0   | 1   | 0   | 1   |
| 42   | Ruth | "M"  | 75   | 131  | 0   | 0   | 0   | 0   | 0   | 1   | 0   | 1   |
+------+------+------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+
Answer 0 (score: 1)
When we only need to check whether data exists in a table, we use EXISTS or IN rather than JOIN (SELECT DISTINCT ...). So this is the query I would probably come up with:
WITH myquery AS
(
SELECT * FROM TEST_FILE1
)
, a as
(
select col1, col2, 42 as col3, col4, col5
from
(
(select distinct col1 from myquery)
cross join
(select distinct col2 from myquery)
cross join
(select distinct col4 from myquery)
cross join
(select distinct col5 from myquery)
)
)
select
a.col1, a.col2, a.col3, a.col4, a.col5,
case when (col1, col3, col5) in (select col1, col3, col5 from myquery ) then 1 else 0 end as v1,
case when (col2, col3, col5) in (select col2, col3, col5 from myquery ) then 1 else 0 end as v2,
case when (col1, col2, col3, col5) in (select col1, col2, col3, col5 from myquery ) then 1 else 0 end as v3,
case when (col2, col3, col5) in (select col2, col3, col5 from myquery where col1 in ('Bert', 'Myra')) then 1 else 0 end as v4,
case when (col1, col3) in (select col1, col3 from myquery ) then 1 else 0 end as v5,
case when (col2, col3) in (select col2, col3 from myquery ) then 1 else 0 end as v6,
case when (col1, col2, col3) in (select col1, col2, col3 from myquery ) then 1 else 0 end as v7,
case when (col2, col3) in (select col2, col3 from myquery where col1 in ('Bert', 'Myra')) then 1 else 0 end as v8
from a
order by a.col1, a.col2, a.col3, a.col4, a.col5;
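The same checks can also be written with EXISTS, the other option mentioned above. The following is only a sketch, not a tuned replacement: it is cut down to two of the eight flags (v5 and v8) and to col1/col2 so that it stays short, and the aliases c1, c2 and m are introduced here purely for illustration. Whether IN or EXISTS performs better on the real data is something to confirm from the execution plan.

```sql
-- Sketch: flags v5 and v8 expressed with correlated EXISTS subqueries.
WITH myquery AS
(
  SELECT * FROM TEST_FILE1
)
, a AS
(
  -- reduced combination set: only col1 x col2, with col3 pinned to 42
  SELECT c1.col1, c2.col2, 42 AS col3
  FROM (SELECT DISTINCT col1 FROM myquery) c1
  CROSS JOIN (SELECT DISTINCT col2 FROM myquery) c2
)
SELECT
  a.col1, a.col2, a.col3,
  -- v5: does any source row have this (col1, col3) pair?
  CASE WHEN EXISTS (
         SELECT 1 FROM myquery m
         WHERE m.col1 = a.col1 AND m.col3 = a.col3
       ) THEN 1 ELSE 0 END AS v5,
  -- v8: does any Bert/Myra row have this (col2, col3) pair?
  CASE WHEN EXISTS (
         SELECT 1 FROM myquery m
         WHERE m.col2 = a.col2 AND m.col3 = a.col3
           AND m.col1 IN ('Bert', 'Myra')
       ) THEN 1 ELSE 0 END AS v8
FROM a
ORDER BY a.col1, a.col2;
```

Note that, like the original join conditions on V1 to V8, plain equality here never matches NULL values.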
If your real query in WITH myquery AS (...) is more than just a SELECT * FROM TEST_FILE1, then you may want to use the /*+MATERIALIZE*/ hint there to speed up access.
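For reference, a minimal sketch of where the hint would go: directly after the SELECT keyword inside the WITH clause. The body of myquery is a placeholder for the real, more expensive query. As far as I know, MATERIALIZE is not an officially documented hint, so treat its effect as something to verify in the execution plan rather than a guarantee.

```sql
-- Sketch: asking Oracle to materialize the factored subquery once
-- instead of re-evaluating it for every reference to myquery.
WITH myquery AS
(
  SELECT /*+ MATERIALIZE */ *
  FROM TEST_FILE1
  -- the real query's joins and filters would go here
)
SELECT COUNT(*) AS row_count
FROM myquery;
```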