Challenging hierarchical query with matching against multiple tables

Time: 2018-05-13 08:02:10

Tags: sql sql-server tsql

I have this business_table:

ref_ID      name    parent_id 
-----------------------------
ABC-0001    Amb     NULL 
PQR-899     boss    NULL
tgv-632     pick    NULL
yyy-888     xyz     NULL
kkk-456     ued     NULL

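I want to update the parent_id column of business_table.

parent_customer is another table; it lists the hierarchy of ref_id and parent_id values and is shown below.

The steps for updating business_table's parent_id are: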

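1) Look up business_table's ref_id in the ref_id column of parent_customer. For example, ref_ID ABC-0001 in business_table matches row 1 of parent_customer (ref_id ABC-0001, parent_id opr-656), so a match is found.

2) Then take the parent_id of the matched parent_customer record and check it against the match_table_CM table. In this case, the parent_id to check is opr-656.

match_table_CM lists the IDs we want to match against before updating the record (we check this because these are CRM IDs, and we need to check that the employee does not exist).

3) If no match is found, look up parent_customer's parent_id opr-656 in the ref_id column of the same parent_customer table. The second record has ref_id opr-656, so take its parent_id ttK-668 and check that. It is found in match_table_CM (row 1, ttK-668), so update business_table's parent_id with it. Otherwise, keep checking until a parent_customer row has ref_id = parent_id (the parent of the whole hierarchy) and update with that ID even if no match was found. So in this case, even without a match, ttK-668 would be the value finally written.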

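Note: parent_customer lists the data hierarchy. When ref_id and parent_id are the same, that row is the parent of the whole hierarchy.

For example:

4   PQR-899   PQR-899   (this is the ultimate parent of its hierarchy)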

parent_customer

ID  ref_id     parent_id  
---------------------------
1   ABC-0001   opr-656
2   opr-656    ttK-668
3   ttK-668    ttK-668
4   PQR-899    PQR-899
5   kkk-565    AJY-567  
6   AJY-567    UXO-989
7   UXO-989    tgv-632
8   tgv-632    mnb-784 
9   mnb-784    qwe-525 
10  qwe-525    qwe-525
11  kkk-456    jjj-888

match_table_CM:

id    main_id
--------------
1     ttK-668
2     PQR-899
3     tgv-632
4     mnb-784

Expected output:

ref_ID      name    parent_id 
-----------------------------
ABC-0001    Amb     ttK-668                    
PQR-899     boss    PQR-899
tgv-632     pick    qwe-525
yyy-888     xyz     NULL
kkk-456     ued     jjj-888

2 answers:

Answer 0 (score: 0)

This should return the expected result:

WITH hierarchy AS
 ( -- all rows from source table
   SELECT b.ref_id, pc.parent_id, 
      0 AS match,
      1 AS lvl
   FROM business_table AS b
   LEFT JOIN parent_customer AS pc
     ON b.ref_id = pc.ref_id

   UNION ALL

   SELECT h.ref_id, pc.parent_id, 
      -- check if we found a match or reached top of hierarchy
      CASE WHEN mt.main_id IS NOT NULL OR pc.parent_id = pc.ref_id THEN 1 ELSE 0 END,
      lvl+1
   FROM hierarchy AS h
   JOIN parent_customer AS pc 
     ON pc.ref_id = h.parent_id -- going up in the hierarchy
   LEFT JOIN match_table_CM AS mt
     ON mt.main_id = pc.ref_id
   WHERE h.match = 0 -- no match yet
     AND lvl < 10 -- just in case there's an endless loop due to bad data
 )
SELECT * FROM hierarchy AS h
WHERE lvl = 
 ( -- return the last row, matching or not
   SELECT Max(lvl)
   FROM hierarchy AS h2
   WHERE h.ref_id = h2.ref_id
 );

Edit:

Rewritten using EXISTS, because SQL Server does not support outer joins in the recursive part of a CTE:

WITH hierarchy AS
 ( -- all rows from source table
   SELECT b.ref_id, pc.parent_id, 
      0 AS match,
      1 AS lvl
   FROM business_table AS b
   LEFT JOIN parent_customer AS pc
     ON b.ref_id = pc.ref_id

   UNION ALL

   SELECT h.ref_id, pc.parent_id, 
      -- check if we found a match or reached top of hierarchy
      CASE WHEN exists
            ( select * 
              from match_table_CM AS mt
              where mt.main_id = pc.ref_id
            ) OR pc.parent_id = pc.ref_id
           THEN 1
           ELSE 0
      END,
      lvl+1
   FROM hierarchy AS h
   JOIN parent_customer AS pc 
     ON pc.ref_id = h.parent_id -- going up in the hierarchy
   WHERE h.match = 0 -- no match yet
     AND lvl < 10 -- just in case there's an endless loop due to bad data
 )
SELECT * FROM hierarchy AS h
WHERE lvl = 
 ( -- return the last row, matching or not
   SELECT Max(lvl)
   FROM hierarchy AS h2
   WHERE h.ref_id = h2.ref_id
 );

The optimizer's plan looked terrible, so here is another rewrite that uses a windowed aggregate instead of the correlated subquery:

WITH  hierarchy AS
 ( -- all rows from source table
   SELECT b.ref_id, pc.parent_id, 
      0 AS match,
      1 AS lvl
   FROM business_table AS b
   LEFT JOIN parent_customer AS pc
     ON b.ref_id = pc.ref_id

   UNION ALL

   SELECT h.ref_id, pc.parent_id, 
      -- check if we found a match or reached top of hierarchy
      CASE WHEN exists
            ( select * 
              from match_table_CM AS mt
              where mt.main_id = pc.ref_id
            ) OR pc.parent_id = pc.ref_id
           THEN 1
           ELSE 0
      END,
      lvl+1
   FROM hierarchy AS h
   JOIN parent_customer AS pc 
     ON pc.ref_id = h.parent_id -- going up in the hierarchy
   WHERE h.match = 0 -- no match yet
     AND lvl < 10 -- just in case there's an endless loop due to bad data
 )
select *
from 
 ( 
   SELECT h.*,
      max(lvl) over (partition by ref_id) as maxlvl
   FROM hierarchy AS h
 ) as dt
WHERE lvl = maxlvl
;
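
To make the traversal easier to follow, here is a minimal Python sketch that mimics how the CTE above builds its rows for the question's sample data. It is only an illustration, not a replacement for the query: the names hierarchy_rows, resolve_parent and max_lvl are hypothetical, and the dictionary assumes each ref_id appears at most once in parent_customer.

```python
# Sample data copied from the question.
business_refs = ["ABC-0001", "PQR-899", "tgv-632", "yyy-888", "kkk-456"]
parent_customer = {
    "ABC-0001": "opr-656", "opr-656": "ttK-668", "ttK-668": "ttK-668",
    "PQR-899": "PQR-899", "kkk-565": "AJY-567", "AJY-567": "UXO-989",
    "UXO-989": "tgv-632", "tgv-632": "mnb-784", "mnb-784": "qwe-525",
    "qwe-525": "qwe-525", "kkk-456": "jjj-888",
}
match_table_cm = {"ttK-668", "PQR-899", "tgv-632", "mnb-784"}


def hierarchy_rows(ref_id, max_lvl=10):
    """Yield (ref_id, parent_id, match, lvl) rows the way the CTE builds them."""
    # Anchor member: LEFT JOIN, so parent_id is None when there is no row.
    parent_id, match, lvl = parent_customer.get(ref_id), 0, 1
    yield (ref_id, parent_id, match, lvl)
    # Recursive member: continue while no match yet and lvl < max_lvl.
    while match == 0 and lvl < max_lvl and parent_id in parent_customer:
        row_ref = parent_id                    # pc.ref_id = h.parent_id
        parent_id = parent_customer[row_ref]   # pc.parent_id
        match = int(row_ref in match_table_cm or parent_id == row_ref)
        lvl += 1
        yield (ref_id, parent_id, match, lvl)


def resolve_parent(ref_id):
    """Keep the row with the highest lvl, matching or not."""
    return max(hierarchy_rows(ref_id), key=lambda row: row[3])[1]


result = {ref: resolve_parent(ref) for ref in business_refs}
print(result)
```

Note that, as in the query, the match test is applied to the row being joined (pc.ref_id) while the value carried forward is that row's parent_id.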

Answer 1 (score: 0)

You can get the ultimate parent using a recursive CTE:

with cte as (
      select pc.ref_id, pc.parent_id as ultimate_parent, 1 as lev
      from parent_customer pc
      where pc.ref_id = pc.parent_id
      union all
      select pc.ref_id, cte.ultimate_parent, lev + 1 
      from cte join
           parent_customer pc
           on pc.parent_id = cte.ref_id and pc.ref_id <> pc.parent_id
    )
select *
from cte;

You can put this into an update:

with cte as (
      select pc.ref_id, pc.parent_id as ultimate_parent, 1 as lev
      from parent_customer pc
      where pc.ref_id = pc.parent_id
      union all
      select pc.ref_id, cte.ultimate_parent, lev + 1 
      from cte join
           parent_customer pc
           on pc.parent_id = cte.ref_id and pc.ref_id <> pc.parent_id
    )
update bt
    set parent_id = cte.ultimate_parent
    from business_table bt join
         cte
         on cte.ref_id = bt.ref_id
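
As a rough illustration of what this top-down approach computes, the following Python sketch walks from the roots (rows where ref_id = parent_id) down to their descendants and then applies the result to the question's sample rows. The function name ultimate_parents and the dictionaries are hypothetical stand-ins for the tables, and the sketch assumes each ref_id appears at most once in parent_customer.

```python
# Sample data copied from the question.
business_table = {"ABC-0001": None, "PQR-899": None, "tgv-632": None,
                  "yyy-888": None, "kkk-456": None}
parent_customer = {
    "ABC-0001": "opr-656", "opr-656": "ttK-668", "ttK-668": "ttK-668",
    "PQR-899": "PQR-899", "kkk-565": "AJY-567", "AJY-567": "UXO-989",
    "UXO-989": "tgv-632", "tgv-632": "mnb-784", "mnb-784": "qwe-525",
    "qwe-525": "qwe-525", "kkk-456": "jjj-888",
}


def ultimate_parents(parent_of):
    """Map each ref_id reachable from a root to that root, working top-down."""
    # Anchor member: rows where ref_id = parent_id are the roots.
    ultimate = {ref: ref for ref, parent in parent_of.items() if ref == parent}
    frontier = set(ultimate)
    while frontier:
        # Recursive member: children whose parent was added in the previous step.
        children = {ref for ref, parent in parent_of.items()
                    if ref != parent and parent in frontier}
        for ref in children:
            ultimate[ref] = ultimate[parent_of[ref]]
        frontier = children
    return ultimate


ultimate = ultimate_parents(parent_customer)
# The UPDATE uses an inner join, so rows without a CTE entry keep their old value.
updated = {ref: ultimate.get(ref, old) for ref, old in business_table.items()}
print(updated)
```

On this sample data the sketch leaves kkk-456 unchanged, because jjj-888 never appears as a ref_id and so no root is reached, whereas the question's expected output lists jjj-888 for that row. The approach also never consults match_table_CM, so it always resolves to the top of the hierarchy.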