SQL: Speed improvement - messy union query

Time: 2011-02-16 23:17:25

Tags: sql join left-join sql-optimization

SELECT * FROM (
    -- Aliases keep the two user_id columns distinct; a UNION takes its
    -- column names from the first SELECT.
    SELECT       a.user_id AS user_id1, a.f_name AS f_name1, a.l_name AS l_name1,
                 b.user_id AS user_id2, b.f_name AS f_name2, b.l_name AS l_name2
    FROM         current_tbl a
    INNER JOIN   import_tbl  b
                 ON ( a.user_id = b.user_id )
    UNION
    SELECT       a.user_id, a.f_name, a.l_name, b.user_id, b.f_name, b.l_name
    FROM         current_tbl a
    INNER JOIN   import_tbl  b
                 ON (   lower(a.f_name)=lower(b.f_name)
                    AND lower(a.l_name)=lower(b.l_name) )
) foo
--
UNION
--
SELECT a.user_id , a.f_name , a.l_name , '' , '' , ''
FROM   current_tbl a
WHERE  a.user_id NOT IN (
   SELECT user_id1 FROM (
      SELECT       a.user_id AS user_id1, a.f_name, a.l_name,
                   b.user_id AS user_id2, b.f_name, b.l_name
      FROM         current_tbl a
      INNER JOIN   import_tbl  b
                   ON ( a.user_id = b.user_id )
      UNION
      SELECT       a.user_id, a.f_name, a.l_name, b.user_id, b.f_name, b.l_name
      FROM         current_tbl a
      INNER JOIN   import_tbl  b
                   ON (   lower(a.f_name)=lower(b.f_name)
                      AND lower(a.l_name)=lower(b.l_name) )
   ) bar
)
ORDER BY user_id1

Example table contents

current_tbl:

-------------------------------
user_id  |  f_name  |  l_name
---------+----------+----------
  A1     |  Adam    |  Acorn
  A2     |  Beth    |  Berry
  A3     |  Calv    |  Chard
         |          |

import_tbl:

-------------------------------
user_id  |  f_name  |  l_name
---------+----------+----------
  A1     |  Adam    |  Acorn
  A2     |  Beth    |  Butcher  <- last_name different
         |          |
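
To reproduce the example, the two tables above could be created roughly as follows. This is only a sketch: the question does not show the real DDL, so the `text` column types and the absence of keys or constraints are assumptions.

```sql
-- Assumption: every column is text; the real schema is not shown in the question.
CREATE TABLE current_tbl (user_id text, f_name text, l_name text);
CREATE TABLE import_tbl  (user_id text, f_name text, l_name text);

-- Rows copied from the current_tbl listing.
INSERT INTO current_tbl (user_id, f_name, l_name) VALUES
    ('A1', 'Adam', 'Acorn'),
    ('A2', 'Beth', 'Berry'),
    ('A3', 'Calv', 'Chard');

-- Rows copied from the import_tbl listing; A2 has a different last name.
INSERT INTO import_tbl (user_id, f_name, l_name) VALUES
    ('A1', 'Adam', 'Acorn'),
    ('A2', 'Beth', 'Butcher');
```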

Expected output:

-----------------------------------------------------------------------
user_id1  |  f_name1  |  l_name1  |  user_id2  |  f_name2  |  l_name2
----------+-----------+-----------+------------+-----------+-----------
   A1     |  Adam     |  Acorn    |     A1     |  Adam     |  Acorn       
   A2     |  Beth     |  Berry    |     A2     |  Beth     |  Butcher
   A3     |  Calv     |  Chard    |            |           |           

Doing it this way gets rid of the row where the condition holds:

   A2     |  Beth     |  Berry    |     A2     |  Beth     |  Butcher

but it keeps the A3 row.


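I hope this makes sense and that I haven't oversimplified it. This is a follow-up to my other question. The improvements carried over from that question brought the query down from about 32000 ms to about 1200 ms, which is quite an improvement.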

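I think I can optimize further by using UNION ALL in the subquery, and of course with the usual index optimizations, but I'm looking for the best SQL optimization. FYI, this particular case is for PostgreSQL.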
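
A minimal sketch of the two ideas just mentioned, under some assumptions: the index names are hypothetical, and the subquery is trimmed to the single column that NOT IN actually needs. UNION ALL skips the duplicate-elimination step of UNION, which should be safe here because NOT IN only tests membership. Whether either change helps would need to be measured on the real data.

```sql
-- Hypothetical index names; the second one is an expression index so that
-- the lower(...) comparisons have an index they could use.
CREATE INDEX import_tbl_user_id_idx    ON import_tbl (user_id);
CREATE INDEX import_tbl_lower_name_idx ON import_tbl (lower(f_name), lower(l_name));

-- The "no match" branch only, with UNION ALL inside the subquery.
SELECT a.user_id, a.f_name, a.l_name, '', '', ''
FROM   current_tbl a
WHERE  a.user_id NOT IN (
    -- matched by id
    SELECT a2.user_id
    FROM   current_tbl a2
    INNER JOIN import_tbl b
           ON a2.user_id = b.user_id
    UNION ALL
    -- matched by case-insensitive name
    SELECT a2.user_id
    FROM   current_tbl a2
    INNER JOIN import_tbl b
           ON  lower(a2.f_name) = lower(b.f_name)
           AND lower(a2.l_name) = lower(b.l_name)
);
```

Against the sample rows, A1 and A2 are both found by the subquery, so only the A3 row should come back from this branch.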

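1 answer:

Answer 0 (score: 1)

I think this is pretty much the same, smaller, and it seems to make more sense to me. My first instinct is that it should run faster, but it may not be the best :))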

SELECT       a.user_id, a.f_name, a.l_name, 
             COALESCE(b.user_id, ''), COALESCE(b.f_name, ''), COALESCE(b.l_name, '')
FROM         current_tbl a
LEFT OUTER JOIN import_tbl  b ON
   ( a.user_id = b.user_id ) OR
   ( lower(a.f_name)=lower(b.f_name) 
     AND lower(a.l_name)=lower(b.l_name) ) 

Edit: Laughing at myself for more or less suggesting that you reverse the changes you made in your original question.
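
Rather than relying on instinct, the two versions can be compared by prefixing each with EXPLAIN ANALYZE, which runs the statement and reports the actual plan and timings. The sketch below does this for the LEFT JOIN form; the ORDER BY is added here only to mirror the question's query. One thing to look for in the plan: an OR in a join condition often keeps PostgreSQL from choosing a hash or merge join, which may be why the UNION form turned out faster.

```sql
-- Runs the query for real and prints the plan with actual row counts and times.
EXPLAIN ANALYZE
SELECT a.user_id, a.f_name, a.l_name,
       COALESCE(b.user_id, ''), COALESCE(b.f_name, ''), COALESCE(b.l_name, '')
FROM   current_tbl a
LEFT OUTER JOIN import_tbl b
       ON  a.user_id = b.user_id
       OR (    lower(a.f_name) = lower(b.f_name)
           AND lower(a.l_name) = lower(b.l_name) )
ORDER BY a.user_id;
```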