Question

我们有一个类似的查询。

SELECT DISTINCT table_a.userid       AS userId,
                table_a.screenname   AS screenName,
FROM   table_a
       LEFT JOIN table_b
              ON ( table_a.userid = table_b.userid )
WHERE  ( table_b.organizationid IS NULL )
       AND ( table_a.companyid = '20002' )
       AND ( table_a.defaultuser = 'f' )
       AND ( ( Lower(table_a.firstname) LIKE '%7189%'
               AND '%7189%' IS NOT NULL )
              OR ( Lower(table_a.middlename) LIKE '%7189%'
                   AND '%7189%' IS NOT NULL )
              OR ( Lower(table_a.lastname) LIKE '%7189%'
                   AND '%7189%' IS NOT NULL )
              OR ( Lower(table_a.screenname) LIKE '%7189%'
                   AND '%7189%' IS NOT NULL )
              OR ( Lower(table_a.emailaddress) LIKE '%7189%'
                   AND '%7189%' IS NOT NULL ) )
       AND ( table_a.status = '0' )

table_b.organizationid有一个非NULL约束，所以table_b.organizationid IS NOT NULL相当于1 = 2，但仍然postgres执行查询的剩余部分，创建临时表并在15秒后返回零行。

无需更改查询，是否可以提高此查询的性能。数据库版本是9.1，但9.3也有相同的行为。

Postgres版本：x86_64-unknown-linux-gnu上的PostgreSQL 9.1.14，由gcc编译（Ubuntu / Linaro 4.6.3-1ubuntu5）4.6.3,64位

table_a有230万行 table_b有540万行

table_a有40列（这里列出太多了） table_b有2列

     Column     |  Type  | Modifiers
----------------+--------+-----------
 userid         | bigint | not null
 organizationid | bigint | not null
Indexes:
    "table_b_pkey" PRIMARY KEY, btree (userid, organizationid)
    "ix_7ef4ec0e" btree (organizationid)
    "ix_fb646ca6" btree (userid)

解释计划

HashAggregate  (cost=638937.42..638937.43 rows=1 width=72) (actual time=21386.436..21386.436 rows=0 loops=1)
   ->  Hash Right Join  (cost=443599.10..638937.40 rows=1 width=72) (actual time=21386.433..21386.433 rows=0 loops=1)
         Hash Cond: (table_b.userid = table_a.userid)
         Filter: (table_b.organizationid IS NULL)
         ->  Seq Scan on table_b  (cost=0.00..95488.04 rows=6020704 width=16) (actual time=0.009..4158.880 rows=5497919 loops=1)
         ->  Hash  (cost=441059.52..441059.52 rows=104846 width=72) (actual time=12356.795..12356.795 rows=215 loops=1)
               Buckets: 16384  Batches: 2  Memory Usage: 12kB
               ->  Seq Scan on table_a  (cost=0.00..441059.52 rows=104846 width=72) (actual time=43.250..12355.735 rows=215 loops=1)
                     Filter: ((NOT defaultuser) AND (companyid = 20002::bigint) AND (status = 0) AND ((lower((firstname)::text) ~~ '%7189%'::text) OR (lower((middlename)::text) ~~ '%7189%'::text) OR (lower((lastname)::text) ~~ '%7189%'::text) OR (lower((screenname)::text) ~~ '%7189%'::text) OR (lower((emailaddress)::text) ~~ '%7189%'::text)))
 Total runtime: 21386.608 ms
(10 rows)

- Sameer

Answer 1

我知道你不想重写查询。但查询重写有助于规划人员。我会把这个查询写成：

SELECT DISTINCT table_a.userid AS userId, table_a.screenname AS screenName, FROM table_a LEFT JOIN (select userid from --i think only userid is needed in this query table_b table_b.organizationid IS NULL ) as table_b ON ( table_a.userid = table_b.userid ) WHERE ( table_a.companyid = '20002' ) AND ( table_a.defaultuser = 'f' ) AND ( ( Lower(table_a.firstname) LIKE '%7189%' AND '%7189%' IS NOT NULL ) OR ( Lower(table_a.middlename) LIKE '%7189%' AND '%7189%' IS NOT NULL ) OR ( Lower(table_a.lastname) LIKE '%7189%' AND '%7189%' IS NOT NULL ) OR ( Lower(table_a.screenname) LIKE '%7189%' AND '%7189%' IS NOT NULL ) OR ( Lower(table_a.emailaddress) LIKE '%7189%' AND '%7189%' IS NOT NULL ) ) AND ( table_a.status = '0' )

postgres查询优化时始终为false，其中存在条件

1 个答案: