Query planner does not accurately choose nested loop join

Time: 2013-10-31 21:13:37

Tags: sql postgresql database-design indexing postgresql-performance

I have this from EXPLAIN ANALYZE:

 ->  Nested Loop  (cost=2173.66..30075.48 rows=77 width=4)
                  (actual time=30.949..399.463 rows=95959 loops=1)

So the expected row count is off from the actual row count by about 3 orders of magnitude, and this makes the query very slow.

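I raised default_statistics_target to 10000 and ran VACUUM / ANALYZE so that the query planner is working from the new statistics. How can I get the query planner to choose a better join strategy?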

I am using Postgres 9.3.1. All of my planner cost constants are still at their defaults:

seq_page_cost: 1
random_page_cost: 4
cpu_tuple_cost: .01
cpu_index_tuple_cost: .005
cpu_operator_cost: .0025
effective_cache_size: 128MB

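I set enable_nestloop = false, but the query did not actually run any faster. My impression is that the large gap between the planner's estimated row count and the actual row count may be what leads to a suboptimal query plan.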

The whole query plan is as follows:

Aggregate  (cost=30444.87..30444.88 rows=1 width=0) (actual time=535.077..535.077 rows=1 loops=1)
      ->  Nested Loop  (cost=2174.08..30444.68 rows=76 width=0) (actual time=23.208..527.062 rows=95451 loops=1)
        ->  Nested Loop  (cost=2173.66..30075.48 rows=77 width=4) (actual time=23.200..351.275 rows=95959 loops=1)
          ->  Hash Left Join  (cost=2173.24..28013.64 rows=401 width=4) (actual time=23.188..133.224 rows=103609 loops=1)
                Hash Cond: (access_rights.target_id = departments.id)
                Join Filter: ((access_rights.target_type)::text = 'Department'::text)
                Filter: ((((access_rights.target_type)::text = 'Company'::text) AND (access_rights.target_id = 173)) OR (((access_rights.target_type)::text = 'User'::text) AND (access_rights.target_id = 11654)) OR (((access_rights.target_type)::text = 'UserGroup'::text) AND (access_rights.target_id = 126)) OR (((access_rights.target_type)::text = 'Department'::text) AND (departments.lft <= 7) AND (departments.rgt >= 8)))
                Rows Removed by Filter: 59127
                ->  Bitmap Heap Scan on access_rights  (cost=2135.97..27236.01 rows=26221 width=14) (actual time=22.844..79.391 rows=162736 loops=1)
                      Recheck Cond: ((((target_type)::text = 'Company'::text) AND (target_id = 173) AND ((section)::text = 'shop'::text)) OR (((target_type)::text = 'User'::text) AND (target_id = 11654) AND ((section)::text = 'shop'::text)) OR (((target_type)::text = 'UserGroup'::text) AND (target_id = 126) AND ((section)::text = 'shop'::text)) OR ((target_type)::text = 'Department'::text))
                      Filter: (((section)::text = 'shop'::text) AND (((active_on IS NOT NULL) AND (active_on <= '2013-10-29'::date) AND ((inactive_on IS NULL) OR (inactive_on > '2013-10-29'::date)) AND (frozen_activation IS NULL)) OR ((frozen_activation)::text = 'active'::text)))
                      Rows Removed by Filter: 9294
                      ->  BitmapOr  (cost=2135.97..2135.97 rows=80823 width=0) (actual time=22.530..22.530 rows=0 loops=1)
                            ->  Bitmap Index Scan on index_access_rights_on_tt_ti_cfc_cfv_ti_s  (cost=0.00..643.10 rows=6861 width=0) (actual time=16.106..16.106 rows=96993 loops=1)
                                  Index Cond: (((target_type)::text = 'Company'::text) AND (target_id = 173) AND ((section)::text = 'shop'::text))
                            ->  Bitmap Index Scan on index_access_rights_on_tt_ti_cfc_cfv_ti_s  (cost=0.00..4.77 rows=12 width=0) (actual time=0.033..0.033 rows=0 loops=1)
                                  Index Cond: (((target_type)::text = 'User'::text) AND (target_id = 11654) AND ((section)::text = 'shop'::text))
                            ->  Bitmap Index Scan on index_access_rights_on_tt_ti_cfc_cfv_ti_s  (cost=0.00..11.68 rows=112 width=0) (actual time=0.238..0.238 rows=1200 loops=1)
                                  Index Cond: (((target_type)::text = 'UserGroup'::text) AND (target_id = 126) AND ((section)::text = 'shop'::text))
                            ->  Bitmap Index Scan on index_access_rights_on_target_type  (cost=0.00..1450.21 rows=73837 width=0) (actual time=6.148..6.148 rows=73837 loops=1)
                                  Index Cond: ((target_type)::text = 'Department'::text)
                ->  Hash  (cost=24.34..24.34 rows=1034 width=12) (actual time=0.331..0.331 rows=1034 loops=1)
                      Buckets: 1024  Batches: 1  Memory Usage: 45kB
                      ->  Seq Scan on departments  (cost=0.00..24.34 rows=1034 width=12) (actual time=0.004..0.179 rows=1034 loops=1)
          ->  Index Scan using tickets_pkey on tickets  (cost=0.42..5.13 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=103609)
                Index Cond: (id = access_rights.ticket_id)
                Filter: (((hold_until IS NULL) OR (hold_until <= '2013-10-29 00:00:00'::timestamp without time zone)) AND (company_id = 173))
                Rows Removed by Filter: 0
    ->  Index Scan using events_pkey on events  (cost=0.42..4.78 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=95959)
          Index Cond: (id = tickets.event_id)
          Filter: ((NOT activity) AND ((canceled_at IS NULL) OR (canceled_at > '2013-10-29 23:11:37.486572'::timestamp without time zone)))
          Rows Removed by Filter: 0
Total runtime: 535.165 ms

We have 17GB of memory.

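The point of this query is to find events that have tickets the user can access in the shop. Access can be determined in several ways. The user has access if they are part of a department that has access to a given ticket, or if the user's department is a parent of a department that has access (nested set: lft, rgt, etc.). If the whole company has been given an access_right to those tickets, the user has access. The user can be part of a UserGroup that has access. The user can be granted individual access to a ticket. The user's company must own the ticket. Tickets can be "frozen" or "inactive", in which case the user will not have access. A ticket is inactive if "active_on" > today or "inactive_on" < today. Tickets are not available if tickets.hold_until > today.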

The query I am running is:

EXPLAIN ANALYZE
SELECT count(*) AS count_all
FROM "events"
INNER JOIN tickets ON events.id = tickets.event_id
INNER JOIN access_rights ON access_rights.ticket_id = tickets.id
LEFT OUTER JOIN departments ON departments.id = access_rights.target_id
     AND access_rights.target_type = 'Department'
WHERE ((("events"."activity" = 'f') AND (events.canceled_at IS NULL OR events.canceled_at > '2013-10-29 23:11:37.486572'))
AND ((((((access_rights.section = 'shop') AND (access_rights.target_type = 'Company'
AND access_rights.target_id = 173)) OR ((access_rights.section = 'shop')
AND (access_rights.target_type = 'User' AND access_rights.target_id = 11654)) OR ((access_rights.section = 'shop')
AND (access_rights.target_type = 'UserGroup'
AND access_rights.target_id IN ('126'))) OR ((access_rights.section = 'shop')
AND (access_rights.target_type = 'Department'
AND departments.lft <= 7 AND departments.rgt >= 8))) 
AND ((access_rights.section = 'shop')
AND ((((access_rights.section = 'shop')
AND (access_rights.active_on IS NOT NULL
AND access_rights.active_on <= '2013-10-29'
AND (access_rights.inactive_on IS NULL OR access_rights.inactive_on > '2013-10-29')))
AND (access_rights.frozen_activation IS NULL)) OR ((access_rights.section = 'shop')
AND (access_rights.frozen_activation = 'active')))))
AND (tickets.hold_until IS NULL OR tickets.hold_until <= '2013-10-29'))
AND (tickets.company_id = 173)));

Tables:

CREATE TABLE tickets (
    hold_until timestamp without time zone,
    event_id integer,
    id integer NOT NULL
 );

Indexes:
    "tickets_pkey" PRIMARY KEY, btree (id)
    "index_tickets_on_company_id" btree (company_id)
    "index_tickets_on_created_at" btree (created_at)
    "index_tickets_on_creation_id" btree (creation_id)
    "index_tickets_on_event_id" btree (event_id)
    "index_tickets_on_hold_until" btree (hold_until)

Foreign-key constraints:
    "tickets_attendee_id_fk" FOREIGN KEY (attendee_id) REFERENCES attendees(id)
    "tickets_company_id_fk" FOREIGN KEY (company_id) REFERENCES companies(id)
    "tickets_event_id_fk" FOREIGN KEY (event_id) REFERENCES events(id)

CREATE TABLE events (
     id integer NOT NULL,
     activity boolean DEFAULT false NOT NULL
 );

Indexes:
    "events_pkey" PRIMARY KEY, btree (id)
    "index_events_on_id_and_te_id" UNIQUE, btree (id, te_id)
    "index_events_on_activity" btree (activity)
    "index_events_on_canceled_at" btree (canceled_at)
    "index_events_on_company_id" btree (company_id)
    "index_events_on_name" btree (name)
    "index_events_on_occurs_at" btree (occurs_at)

Foreign-key constraints:
    "events_company_id_fk" FOREIGN KEY (company_id) REFERENCES companies(id)

CREATE TABLE departments (
   id integer NOT NULL,
   parent_id integer,
   lft integer NOT NULL,
   rgt integer NOT NULL
);

Indexes:
   "departments_pkey" PRIMARY KEY, btree (id)
   "index_departments_on_company_id_and_parent_id_and_name" UNIQUE, btree (company_id, parent_id, name)
   "index_departments_on_company_id" btree (company_id)
   "index_departments_on_lft" btree (lft)
   "index_departments_on_name" btree (name)
   "index_departments_on_parent_id" btree (parent_id)
   "index_departments_on_rgt" btree (rgt)

Foreign-key constraints:
   "departments_company_id_fk" FOREIGN KEY (company_id) REFERENCES companies(id)

CREATE TABLE access_rights (
   id integer NOT NULL,
   target_type character varying(255) NOT NULL,
   target_id integer NOT NULL,
   ticket_id integer NOT NULL,
   active_on date,
   visible boolean,
   inactive_on date,
   frozen_activation character varying(255)
);

Indexes:
   "access_rights_pkey" PRIMARY KEY, btree (id)
   "index_access_rights_on_tt_ti_cfc_cfv_ti_s" UNIQUE, btree (target_type, target_id, custom_field_condition, custom_field_value, ticket_id, section)
   "index_access_rights_on_active_on" btree (active_on)
   "index_access_rights_on_custom_field_value" btree (custom_field_value)
   "index_access_rights_on_frozen_activation" btree (frozen_activation)
   "index_access_rights_on_inactive_on" btree (inactive_on)
   "index_access_rights_on_section" btree (section)
   "index_access_rights_on_target_id" btree (target_id)
   "index_access_rights_on_target_type" btree (target_type)
   "index_access_rights_on_target_type_and_target_id" btree (target_type, target_id) CLUSTER
   "index_access_rights_on_ticket_id" btree (ticket_id)
   "index_access_rights_on_visible" btree (visible)

Foreign-key constraints:
   "access_rights_ticket_id_fk" FOREIGN KEY (ticket_id) REFERENCES tickets(id)

I know this is a lot; thanks for taking the time to look through it.

1 answer:

Answer 0 (score: 3)

Server configuration

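Obviously: the default settings are very conservative, meant for small installations with limited resources out of the box. For a dedicated database server, some of the defaults are inappropriate. You have to tune your settings.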

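First, if you have enough RAM to cache all or most of your database, set random_page_cost lower, and increase the relative cost of CPU operations. Something like this (pure guesswork!):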

seq_page_cost: 1
random_page_cost: 1.2
cpu_tuple_cost: .02
cpu_index_tuple_cost: .02
cpu_operator_cost: .005

effective_cache_size is often set too low. For a dedicated database server, it can be as high as three quarters of total RAM.
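Before changing postgresql.conf, the settings above can be tried out for a single transaction. The following is a minimal sketch: the cost values are the guesses listed above, the 12GB figure is an assumption derived from three quarters of the 17GB mentioned in the question, and the SELECT is only a placeholder for the real query.

```sql
-- Try planner settings for one transaction only; nothing is persisted.
BEGIN;

SET LOCAL random_page_cost     = 1.2;
SET LOCAL cpu_tuple_cost       = 0.02;
SET LOCAL cpu_index_tuple_cost = 0.02;
SET LOCAL cpu_operator_cost    = 0.005;
SET LOCAL effective_cache_size = '12GB';  -- assumption: about 3/4 of 17GB RAM

-- Placeholder: substitute the query under investigation here.
EXPLAIN ANALYZE
SELECT count(*)
FROM   access_rights
WHERE  section = 'shop';

ROLLBACK;  -- SET LOCAL values end with the transaction either way
```

Comparing the plan and runtime with and without the SET LOCAL lines shows whether the new cost constants actually change the planner's choice before they are applied server-wide.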

@Craig has compiled a lot of advice on performance tuning:
Optimise PostgreSQL for fast testing

The Postgres Wiki has even more.

Query

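Too many redundant parentheses, and hard to read. Use table aliases and proper formatting before trying to debug, let alone presenting it to the public. Untangled: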

SELECT count(*) AS count_all
FROM   events           e
JOIN   tickets          t ON t.event_id = e.id
JOIN   access_rights    a ON a.ticket_id = t.id
LEFT   JOIN departments d ON d.id = a.target_id
                         AND a.target_type = 'Department'
WHERE  e.activity = 'f'
AND   (e.canceled_at IS NULL OR e.canceled_at > '2013-10-29 23:11:37')

AND   (t.hold_until IS NULL OR t.hold_until <= '2013-10-29')
AND    t.company_id = 173

AND    a.section = 'shop'
AND   (a.target_type = 'Company'   AND a.target_id = 173
   OR  a.target_type = 'User'      AND a.target_id = 11654
   OR  a.target_type = 'UserGroup' AND a.target_id IN (126)
   OR                                  d.lft <= 7 AND d.rgt >= 8
    -- a.target_type = 'Department' is redundant
)
AND   (a.frozen_activation = 'active'
   OR     a.active_on <= '2013-10-29'
     AND (a.inactive_on IS NULL OR a.inactive_on > '2013-10-29')
     AND  a.frozen_activation IS NULL
);

Main points

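  • Redundant: AND a.active_on IS NOT NULL, since you also have AND a.active_on <= '2013-10-29', which can never be true for a NULL value.

  • AND a.target_id IN ('126') should be AND a.target_id = 126, or at least AND a.target_id IN (126) (numeric constant).

  • a.target_type = 'Department' is redundant, since it is already in the LEFT JOIN condition.

  • AND a.section = 'shop' is redundant several times over.

  • target_type should most probably be an enum, or an integer column target_type_id referencing a lookup table target_type, instead of varchar(255):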

    CREATE TABLE access_rights (
       ...
      ,target_type_id integer NOT NULL REFERENCES target_type(target_type_id)
       ...
    );
    

    The same goes for a.frozen_activation and a.section.

That would also make the indexes I suggest more effective.
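As a rough illustration of the enum alternative, the existing column could be converted in place. This is only a sketch under assumptions: the type name target_type_enum is made up for the example, and the four labels are the ones that appear in the query, so any other value stored in the table would make the conversion fail.

```sql
-- Hypothetical enum type; labels taken from the values used in the query.
CREATE TYPE target_type_enum AS ENUM ('Company', 'User', 'UserGroup', 'Department');

-- Convert the varchar(255) column in place.
-- Fails if the column holds a value that is not one of the labels above.
ALTER TABLE access_rights
  ALTER COLUMN target_type TYPE target_type_enum
  USING target_type::text::target_type_enum;
```

The ALTER TABLE rewrites the table and its indexes while holding an exclusive lock, so it is best tested on a copy first. With this variant the column keeps its name target_type, so an index written for target_type_id would reference target_type instead.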

Indexes

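Add some multicolumn / partial indexes. Tailor them yourself; I don't know the cardinalities and data distribution. Note the DESC clauses in strategic places.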

CREATE INDEX e_idx ON events (company_id, event_id, hold_until)
WHERE activity = FALSE;

CREATE INDEX t_idx ON tickets (company_id, event_id, hold_until DESC);

CREATE INDEX a_idx1 ON access_rights (target_type_id, target_id)
WHERE section = 'shop';

CREATE INDEX a_idx2 ON access_rights
                   (frozen_activation, active_on DESC, inactive_on)
WHERE section = 'shop';

CREATE INDEX d_idx ON departments (target_type, lft DESC, rgt);

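Other than these, you only need the primary keys and indexes on the foreign keys. All the other indexes you display are useless for this query. Drop some if they are not needed elsewhere.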
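One way to judge whether an index is needed elsewhere is to look at how often it has been scanned. The sketch below uses the standard statistics view pg_stat_user_indexes; the table names come from the question, and the counters only cover the period since statistics were last reset, so a low count is a hint rather than proof.

```sql
-- List non-unique indexes on the four tables, least-used first.
SELECT s.relname      AS table_name,
       s.indexrelname AS index_name,
       s.idx_scan,                                        -- scans since last stats reset
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM   pg_stat_user_indexes s
JOIN   pg_index i ON i.indexrelid = s.indexrelid
WHERE  s.relname IN ('events', 'tickets', 'access_rights', 'departments')
AND    NOT i.indisunique      -- skip primary keys and unique constraints
ORDER  BY s.idx_scan, pg_relation_size(s.indexrelid) DESC;
```

Indexes that stay at idx_scan = 0 over a representative workload are candidates for removal, since every extra index adds cost to writes on the table.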

For details on how to tailor these indexes, consider the related answers on dba.SE.