Selecting from two slightly similar tables with complex conditions

Time: 2018-02-17 09:14:53

Tags: sql oracle

I have 4 tables in an Oracle database. They have complex relationships and no useful primary keys.

TableA

+------+------+------+------+------+-----------------+
| ColA | ColX | ColY | ColZ | ColZa|      A          |
+------+------+------+------+------+-----------------+
| k9   | a1   | c1   | g1   | z1   | 2018-02-19      |
| k9   | a1   | c1   | g3   | z2   | 2018-02-02      |
| k10  | a2   | f3   | g1   | z3   | 2018-02-09      |
| k10  | a    | b    | c    | d    | 2018-02-03      |
| k    | a    | b    | c1   | z2   | 2018-02-01      |
| k9   | a1   | c1   | c9   | z5   | 2018-02-04      |
| k9   | a1   | c1   | c2   | z5   | 2018-02-03      |
| k9   | a1   | c1   | g2   | z5   | 2018-02-03      |
+------+------+------+------+------+-----------------+

TableB

+------+------+------+------+------+----------------+
| ColA | ColX | ColY | ColZ | ColZa|      B         |
+------+------+------+------+------+----------------+
| e    | a3   | f    | g1   | i    | 2018-02-03     |
| e3   | a1   | f1   | g3   | d2   | 2018-02-04     |
| k9   | a1   | c1   | g2   | z5   | 2018-02-08     |
| e4   | a4   | f2   | g2   | i2   | 2018-02-07     |
| e5   | a1   | f1   | g1   | d2   | 2018-02-06     |
| k9   | a1   | c1   | g1   | d2   | 2018-02-22     |
+------+------+------+------+------+----------------+

TableC

+------+------+------+----------------+
| ColA | ColX | ColY |      C         |
+------+------+------+----------------+
| ab   | c2   | c2   | cx             |
| k9   | a1   | c1   | cy             |
| cd   | a2   | c3   | cy             |
| ef   | c2   | c4   | cz             |
| ef   | c2   | c2   | cz             |
+------+------+------+----------------+

TableD

+------+------+------+----------------+
| ColA | ColX | ColY |       D        |
+------+------+------+----------------+
| e    | a    | f    | dx             |
| e1   | a    | a    | dy             |
| e2   | a1   | a1   | dz             |
+------+------+------+----------------+

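Some business logic requires me to select and merge data from TableA and TableB.

Problem: get the records ColA | ColX | ColY | ColZ | ColZa in TableA and/or TableB for the case where the pseudo-key ColA_ColX_ColY has a value ColZ = 'g1', and merge them on ColA, ColX, ColY, ColZ, ColZa, A, B. I say "pseudo" because it is not a real key; it is only a means of identifying the records of interest in TableA and TableB.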

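To construct a valid key, count(ColY) must be 1 for a given ColX in TableC and TableD. (This actually holds in all four tables, but only if you consider distinct values; I want to use only TableC and TableD because it is more explicit there.)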
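The key rule above can be sketched as a query. This is only one reading of the rule: it assumes the count must be 1 in both TableC and TableD, which is why the two result sets are combined with INTERSECT.

```sql
-- ColX values that appear with exactly one ColY row in TableC ...
select colx
from   tablec
group  by colx
having count(coly) = 1
intersect
-- ... and also with exactly one ColY row in TableD
-- (assumption: the rule has to hold in both tables)
select colx
from   tabled
group  by colx
having count(coly) = 1;
```

On the sample data, TableC alone yields a1 and a2, TableD alone yields a1, so the intersection leaves only a1.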

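Process: in the result table below, I should get row 1 of TableA because 'a1' has count(ColY) = 1 in TableC, but I ignore row 1 of TableB and row 3 of TableA because count(ColY) in TableC or TableD is not equal to 1. Now that I have a value 'a1' from TableC.ColX that meets my criteria, I select all records in TableA and TableB where ColX = 'a1' and ColY = 'c1' and ColA = 'k9'.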
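A minimal sketch of that last step, assuming the pseudo-key (k9, a1, c1) is already known and hard-coded. Each table is filtered in an inline view before the full outer join, so rows that exist on only one side are not lost to a later WHERE clause, and COALESCE fills the shared columns from whichever side has the row. Joining on all five shared columns is an assumption based on the merge described in the question.

```sql
-- Assumption: the pseudo-key values are hard-coded for illustration only.
select coalesce(a.cola,  b.cola)  as cola,
       coalesce(a.colx,  b.colx)  as colx,
       coalesce(a.coly,  b.coly)  as coly,
       coalesce(a.colz,  b.colz)  as colz,
       coalesce(a.colza, b.colza) as colza,
       a.a,
       b.b
from   ( select *
         from   tablea
         where  cola = 'k9' and colx = 'a1' and coly = 'c1' ) a
       full outer join
       ( select *
         from   tableb
         where  cola = 'k9' and colx = 'a1' and coly = 'c1' ) b
            on  a.cola  = b.cola
            and a.colx  = b.colx
            and a.coly  = b.coly
            and a.colz  = b.colz
            and a.colza = b.colza;
```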

My desired result

+------+------+------+------+------+-----------------+----------------+
| ColA | ColX | ColY | ColZ | ColZa|      A          |        B       |
+------+------+------+------+------+-----------------+----------------+
| k9   | a1   | c1   | g1   | z1   | 2018-02-19      |    [null]      |
| k9   | a1   | c1   | g3   | z2   | 2018-02-02      |    [null]      |
| k9   | a1   | c1   | c9   | z5   | 2018-02-04      |    [null]      |
| k9   | a1   | c1   | c2   | z5   | 2018-02-03      |    [null]      |
| k9   | a1   | c1   | g2   | z5   | 2018-02-03      | 2018-02-08     |
| k9   | a1   | c1   | g4   | d2   |     [null]      | 2018-02-22     |
+------+------+------+------+------+-----------------+----------------+

So I wrote a query like this:

select a.ColX, a.ColY, a.ColZ, a.ColZa, a.A, b.B
from   TableA a
       FULL OUTER JOIN TableB b
            ON  a.ColX = b.ColX
            AND a.ColY = b.ColY
            AND a.ColZ = b.ColZ
 where (
   a.ColX IN 
   (select ColX from TableA where 
     ColX IN 
        (select ColX from TableC group by ColX HAVING count(ColY)=1) and 
     ColX in 
        (select distinct ColX from TableB where ColZ = 'g1' and B > trunc(sysdate) - 365) 
    group by ColX having count(distinct ColY)=1) 
 OR 
   b.ColX IN
   (select ColX from TableA where 
     ColX IN
        (select ColX from TableC group by ColX HAVING count(ColY)=1) and
     ColX in
        (select distinct ColX from TableB where ColZ = 'g1' and B > trunc(sysdate) - 365)
    group by ColX having count(distinct ColY)=1));

I have no control over the data model. How can I make my query work? TableA and TableB hold about 100,000 records, and TableC and TableD up to a million.

SQL is not my area of expertise, and I really hope I'm not in over my head here.

1 Answer:

Answer 0: (score: 1)

I don't understand what your query is supposed to do, but as a pure refactoring exercise, I got this:

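with whatever as
       ( select colx
         from   tablea
         where  colx in
                ( -- ColX values with exactly one ColY row in TableC
                  select colx
                  from   tablec
                  group by colx having count(coly) = 1
                  -- intersect keeps the AND between the two original IN conditions
                  intersect
                  -- ColX values with a 'g1' row in TableB within the last 365 days
                  select colx
                  from   tableb
                  where  colz = 'g1'
                  and    b > trunc(sysdate) - 365 )
         group  by colx
         having count(distinct coly) = 1 )
select a.colx, a.coly, a.colz, a.colza, a.a, b.b
from   tablea a
       full outer join tableb b
            on  a.colx = b.colx
            and a.coly = b.coly
            and a.colz = b.colz
       -- keep a joined row when either side carries a qualifying ColX
       join whatever w
            on w.colx in (a.colx, b.colx);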
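
One thing the refactored query keeps from the original is a select list drawn only from TableA, so a row that exists only in TableB would come back with null in ColX, ColY, ColZ and ColZa. The sketch below is one possible way to get closer to the desired result table, not a confirmed solution: it derives the pseudo-keys from TableC alone, filters both tables by them, and joins on all five shared columns. The names valid_keys, a_rows and b_rows are hypothetical, and the TableD check and the ColZ = 'g1' condition are deliberately left out to keep it short.

```sql
with valid_keys as
       ( -- one row per ColX that occurs exactly once in TableC; with a single
         -- row in the group, min() just returns that row's ColA and ColY
         select colx, min(cola) as cola, min(coly) as coly
         from   tablec
         group  by colx
         having count(coly) = 1 ),
     a_rows as
       ( select *
         from   tablea
         where  (cola, colx, coly) in (select cola, colx, coly from valid_keys) ),
     b_rows as
       ( select *
         from   tableb
         where  (cola, colx, coly) in (select cola, colx, coly from valid_keys) )
select coalesce(a.cola,  b.cola)  as cola,
       coalesce(a.colx,  b.colx)  as colx,
       coalesce(a.coly,  b.coly)  as coly,
       coalesce(a.colz,  b.colz)  as colz,
       coalesce(a.colza, b.colza) as colza,
       a.a,
       b.b
from   a_rows a
       full outer join b_rows b
            on  a.cola  = b.cola
            and a.colx  = b.colx
            and a.coly  = b.coly
            and a.colz  = b.colz
            and a.colza = b.colza;
```

Filtering each side before the full outer join also keeps the join itself small, which may matter with the table sizes mentioned in the question.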