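How do I write a query with outer joins and multiple OR criteria?

Date: 2017-11-17 11:01:15

Tags: sql postgresql

In this query we perform outer joins from a large table, release, to two tables, release_event and release_barcode. These have to be outer joins because the result should also include releases that have no entry in one or both of those tables: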

SELECT r.source_uri AS su_on_r, r.title AS t_on_r, event.cat, barcode.barcode
FROM   release r 
    left join release_event event ON r.source_uri = event.source_uri 
    left join release_barcode barcode ON r.source_uri = barcode.source_uri 
WHERE  ( event.cat IN ( '3145422912' ) AND event.label_name = 'UTV Records' ) 
    OR barcode.barcode IN ( '731454229128' ) 

When I run EXPLAIN on it, the plan does a sequential scan on the release table:

|   ->  Sort  (cost=2169219.61..2169219.65 rows=13 width=66)                                                                                                                  |
|         Sort Key: r.source_uri                                                                                                                                              |
|         ->  Hash Left Join  (cost=721970.94..2169219.37 rows=13 width=66)                                                                                                   |
|               Hash Cond: ((r.source_uri)::text = (barcode.source_uri)::text)                                                                                                |
|               Filter: ((((event.cat)::text = '3145422912'::text) AND ((event.label_name)::text = 'UTV Records'::text)) OR ((barcode.barcode)::text = '731454229128'::text)) |
|               ->  Hash Right Join  (cost=589927.01..1568193.22 rows=11405330 width=88)                                                                                      |
|                     Hash Cond: ((event.source_uri)::text = (r.source_uri)::text)                                                                                            |
|                     ->  Seq Scan on release_event event  (cost=0.00..283078.30 rows=11405330 width=65)                                                                      |
|                     ->  Hash  (cost=324393.56..324393.56 rows=10963956 width=66)                                                                                            |
|                           ->  Seq Scan on release r  (cost=0.00..324393.56 rows=10963956 width=66)                                                                          |
|               ->  Hash  (cost=63125.97..63125.97 rows=2965197 width=60)                                                                                                     |
|                     ->  Seq Scan on release_barcode barcode  (cost=0.00..63125.97 rows=2965197 width=6

If I keep only one of the criteria above, the query planner (PostgreSQL) understands this and uses an index to restrict the rows read from release:

|   ->  Sort  (cost=42.26..42.27 rows=3 width=66)                                                                               |
|         Sort Key: r.source_uri                                                                                                |
|         ->  Nested Loop  (cost=0.99..42.24 rows=3 width=66)                                                                   |
|               ->  Index Scan using release_barcode_barcode_idx on release_barcode barcode  (cost=0.43..16.47 rows=3 width=47) |
|                     Index Cond: ((barcode)::text = '731454229128'::text)                                                      |
|               ->  Index Scan using release_pkey on release r  (cost=0.56..8.58 rows=1 width=66)                               |
|                     Index Cond: ((source_uri)::text = (barcode.source_uri)::text)         

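How can I rewrite the query with outer joins and multiple OR criteria so that it restricts the joined tables first and then looks up the release table only for the rows that matched?

3 answers:

Answer 0: (score: 2)

Filter before joining: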

select 
    source_uri as su_on_r, 
    r.title as t_on_r, 
    event.cat, 
    barcode.barcode
from  
    release r 
    left join (
        select *
        from release_event
        where cat in ('3145422912') and label_name = 'UTV Records'
    ) event using (source_uri)
    left join (
        select *
        from release_barcode 
        where barcode in ('731454229128') 
    ) barcode using (source_uri)

Answer 1: (score: 1)

You can try writing this using exists:

SELECT r.source_uri as su_on_r, r.title as t_on_r
FROM release r 
WHERE EXISTS (SELECT 1
              FROM release_event re
              WHERE r.source_uri = re.source_uri AND
                    re.cat IN ( '3145422912' ) AND re.label_name = 'UTV Records'
             ) OR
      EXISTS (SELECT 1
              FROM release_barcode rb 
              WHERE r.source_uri = rb.source_uri AND
                    rb.barcode IN ( '731454229128' ) 
             );

For these queries, I recommend indexes on release_event(source_uri, cat, label_name) and release_barcode(source_uri, barcode).

You can also express the query using union (which removes duplicates):
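
As a rough sketch, the two recommended indexes could be created as follows. The index names are made up for illustration and are not from the question; the column lists are the ones suggested above.

```sql
-- Hypothetical index names; adjust to your own naming convention.
-- Supports the correlated lookup on release_event by source_uri,
-- with cat and label_name available in the same index.
CREATE INDEX release_event_uri_cat_label_idx
    ON release_event (source_uri, cat, label_name);

-- Supports the correlated lookup on release_barcode by source_uri and barcode.
CREATE INDEX release_barcode_uri_barcode_idx
    ON release_barcode (source_uri, barcode);
```

Whether the planner actually uses them depends on the table statistics, so it is worth checking the plan afterwards.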

SELECT r.source_uri AS su_on_r, r.title AS t_on_r
FROM release r JOIN
     release_event event
     ON r.source_uri = event.source_uri 
WHERE  event.cat IN ( '3145422912' ) AND event.label_name = 'UTV Records' 
UNION
SELECT r.source_uri AS su_on_r, r.title AS t_on_r
FROM release r JOIN
     release_barcode barcode
     ON r.source_uri = barcode.source_uri 
WHERE barcode.barcode IN ( '731454229128' )  ;
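
The union above returns only the release columns. If cat and barcode are needed as in the original query, one possible variant (a sketch, not tested against the asker's data) starts each branch from the filtered table and outer-joins the other one. Because UNION removes duplicates, it can return fewer rows than the original query if identical rows occur more than once.

```sql
-- Branch 1: releases matched through release_event; barcode is optional.
SELECT r.source_uri AS su_on_r, r.title AS t_on_r, event.cat, barcode.barcode
FROM release_event event
     JOIN release r ON r.source_uri = event.source_uri
     LEFT JOIN release_barcode barcode ON barcode.source_uri = r.source_uri
WHERE event.cat IN ( '3145422912' ) AND event.label_name = 'UTV Records'
UNION
-- Branch 2: releases matched through release_barcode; event is optional.
SELECT r.source_uri AS su_on_r, r.title AS t_on_r, event.cat, barcode.barcode
FROM release_barcode barcode
     JOIN release r ON r.source_uri = barcode.source_uri
     LEFT JOIN release_event event ON event.source_uri = r.source_uri
WHERE barcode.barcode IN ( '731454229128' );
```

Each branch has a single selective condition on the table it starts from, which is the shape the planner handled with an index in the question's second plan.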

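Answer 2: (score: 1)

This is a shot in the dark, but what if you moved the where conditions into the left joins and then tested whether either join actually found a record?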

SELECT
  r.source_uri AS su_on_r, r.title AS t_on_r, event.cat, barcode.barcode
FROM
  release r 
  left join release_event event ON
    r.source_uri = event.source_uri  and
    event.cat IN ( '3145422912' ) AND 
    event.label_name = 'UTV Records'
  left join release_barcode barcode ON 
     r.source_uri = barcode.source_uri and
     barcode.barcode IN ( '731454229128' ) 
WHERE
  event.source_uri is not null or
  barcode.source_uri is not null

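@a_horse_with_no_name commented on how your construct effectively becomes an inner join, and I think this would get around that. Whether it gets the results (and the performance you are after), I am not sure.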
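
One way to find out is to run the rewritten query under EXPLAIN and compare its plan with the two plans in the question. A minimal sketch using the tables and literals from the question; note that the ANALYZE option actually executes the query, so on tables this size it may take a while:

```sql
-- ANALYZE runs the query and reports real timings; BUFFERS adds I/O counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  r.source_uri AS su_on_r, r.title AS t_on_r, event.cat, barcode.barcode
FROM
  release r
  LEFT JOIN release_event event ON
    r.source_uri = event.source_uri AND
    event.cat IN ( '3145422912' ) AND
    event.label_name = 'UTV Records'
  LEFT JOIN release_barcode barcode ON
    r.source_uri = barcode.source_uri AND
    barcode.barcode IN ( '731454229128' )
WHERE
  event.source_uri IS NOT NULL OR
  barcode.source_uri IS NOT NULL;
```

If the output still shows a Seq Scan on release, this rewrite has not changed how the planner reaches that table.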