How can I optimize a PostgreSQL query and avoid using "NOT IN"?

Asked: 2014-11-21 06:16:32

Tags: postgresql

A PostgreSQL newbie question. I have the following query, in which I am trying to return the union of all records that meet these conditions:

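  1. All dx marked chronic prior to tencounter,
  2. All recurring dx (i.e., a dx record marked treated that is more recent than any previous resolution),
  3. Any dx record that is marked treated and has never been resolved.

Is there a better way to do this (perhaps using PostgreSQL's WITH clause)? I have read that one should never use "NOT IN" in PostgreSQL, so how could this be done better? How do you "optimize" this thing?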

       CREATE OR REPLACE FUNCTION f_getactivedx(groupid character varying, tencounter timestamp without time zone)
       RETURNS SETOF view_dx AS
    $BODY$
    
        select max(dx.recid) as recid, dx.cicd9, dx.cdesc, max( dx.tposted) as tposted, 
        bool_or(dx.resolved) as resolved, bool_or(dx.treated) as treated, bool_or(dx.chronic),
        dx.groupid 
            from dx 
        where dx.chronic = true 
        and dx.groupid = $1 
        and date_trunc('day',dx.tposted) <= date_trunc('day',$2)
        group by dx.cicd9, dx.cdesc, dx.groupid
    
        union
    
    
        select max(dx.recid) as recid, dx.cicd9, dx.cdesc, max( dx.tposted) as tposted, 
        bool_and(dx.resolved), bool_and(dx.treated), bool_and(dx.chronic), dx.groupid 
        from dx 
                join    (select cdesc, max(tposted) as tposted from dx  
                           where groupid =$1  and resolved = true and 
                           date_trunc('day',tposted) <=  date_trunc('day', $2)
                           group by cdesc) j 
                  on (dx.cdesc = j.cdesc and dx.tposted > j.tposted) 
                  where groupid = $1 and treated = true
                  and date_trunc('day',dx.tposted)  <= date_trunc('day', $2)
                  group by dx.cicd9, dx.cdesc, dx.groupid 
    
        union 
    
        select max(dx.recid) as recid, dx.cicd9, dx.cdesc, max( dx.tposted), 
           bool_and(dx.resolved), bool_and(dx.treated), bool_and(dx.chronic), dx.groupid 
                  from dx 
                  where dx.cdesc NOT IN
                    (select cdesc from dx  
                           where groupid =$1  and resolved = true and 
                           date_trunc('day',tposted) <=  date_trunc('day', $2)
                           group by cdesc) 
                 and groupid =$1 and treated = true and
                  date_trunc('day',tposted)   <= date_trunc('day', $2)        
                  group by dx.cicd9, dx.cdesc, dx.groupid
    
        order by tposted desc, treated desc, resolved desc, cdesc asc 
    $BODY$
      LANGUAGE sql;
    

2 Answers:

Answer 0 (score: 1)

NOT IN is fine; you just have to think carefully about NULLs. An anti-join is often the better choice.

For any query of the form:

SELECT ...
FROM t
WHERE col NOT IN (SELECT col2 FROM t2 WHERE col2 IS NOT NULL AND ...predicate...)

you can equivalently write:

SELECT ...
FROM t LEFT OUTER JOIN t2 ON (t.col = t2.col2 AND ...predicate...)
WHERE t2.col2 IS NULL;

This is called a "left anti-join".

PostgreSQL will probably generate the same query plan for both.

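Unless you know, after properly examining the output of explain analyze, that the NOT IN is causing a performance problem for some reason, I strongly suggest leaving well alone.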
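
To make the NULL caveat concrete, here is a minimal sketch using two throwaway temporary tables. The tables t and t2 and their rows are hypothetical sample data, not the asker's schema. It illustrates why the col2 IS NOT NULL filter matters and how the plans could be compared before rewriting anything.

```sql
-- Hypothetical sample data, for illustration only.
CREATE TEMP TABLE t  (col  text);
CREATE TEMP TABLE t2 (col2 text);
INSERT INTO t  VALUES ('a'), ('b');
INSERT INTO t2 VALUES ('a'), (NULL);

-- Returns no rows: 'b' NOT IN ('a', NULL) evaluates to NULL, not true.
SELECT col FROM t WHERE col NOT IN (SELECT col2 FROM t2);

-- Returns 'b': the NULL is filtered out of the subquery first.
SELECT col FROM t
WHERE col NOT IN (SELECT col2 FROM t2 WHERE col2 IS NOT NULL);

-- Anti-join form: also returns 'b'.
SELECT t.col
FROM t LEFT OUTER JOIN t2 ON (t.col = t2.col2)
WHERE t2.col2 IS NULL;

-- Inspect the actual plan and timing of each variant before choosing one.
EXPLAIN ANALYZE
SELECT col FROM t
WHERE col NOT IN (SELECT col2 FROM t2 WHERE col2 IS NOT NULL);
```

One further difference to keep in mind: if t.col itself is NULL and the subquery returns at least one row, the NOT IN forms drop that row while the anti-join keeps it, so the two are only interchangeable when t.col cannot be NULL.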

Answer 1 (score: 1)

Using NOT EXISTS is usually more efficient than using NOT IN.

SELECT A.*
FROM A
WHERE NOT EXISTS (
  SELECT 1
  FROM B
  WHERE A.id = B.id
)
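
Applied to the question, the third branch of the union (treated but never resolved) might be written with NOT EXISTS roughly as follows. This is only a sketch: it assumes the dx table and columns shown in the question, wraps the branch in a prepared statement so that $1 and $2 have a meaning outside the function, and introduces the statement name dx_treated_unresolved, the alias r, and the argument values purely for illustration.

```sql
-- Hypothetical prepared statement standing in for the third UNION branch.
PREPARE dx_treated_unresolved(character varying, timestamp without time zone) AS
select max(dx.recid) as recid, dx.cicd9, dx.cdesc, max(dx.tposted) as tposted,
       bool_and(dx.resolved) as resolved, bool_and(dx.treated) as treated,
       bool_and(dx.chronic) as chronic, dx.groupid
from dx
where dx.groupid = $1
  and dx.treated = true
  and date_trunc('day', dx.tposted) <= date_trunc('day', $2)
  and not exists (
        -- a resolved record with the same description in the same group
        select 1
        from dx r
        where r.cdesc = dx.cdesc
          and r.groupid = $1
          and r.resolved = true
          and date_trunc('day', r.tposted) <= date_trunc('day', $2)
      )
group by dx.cicd9, dx.cdesc, dx.groupid;

-- Placeholder arguments; substitute a real group id and encounter time.
EXPLAIN ANALYZE EXECUTE dx_treated_unresolved('some-group', now()::timestamp);
```

Unlike the NOT IN version, this form is not affected by a NULL cdesc among the resolved rows, and it keeps treated rows whose own cdesc is NULL, so the results can differ if cdesc is nullable.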