PostgreSQL 9.6.3 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit
Table and indexes:
create table if not exists orders
(
id bigserial not null constraint orders_pkey primary key,
partner_id integer,
order_id varchar,
date_created date,
state_code integer,
state_date timestamp,
recipient varchar,
phone varchar
);
create index if not exists orders_partner_id_index on orders (partner_id);
create index if not exists orders_order_id_index on orders (order_id);
create index if not exists orders_partner_id_date_created_index on orders (partner_id, date_created);
The task is to implement pagination, sorting, and filtering of the data.
The query for the first page:
select order_id, date_created, recipient, phone, state_code, state_date
from orders
where partner_id=1 and date_created between '2019-04-01' and '2019-04-30'
order by order_id asc limit 10 offset 0;
Query plan:
QUERY PLAN
"Limit (cost=19495.48..38990.41 rows=10 width=91)"
" -> Index Scan using orders_order_id_index on orders (cost=0.56..**41186925.66** rows=21127 width=91)"
" Filter: ((date_created >= '2019-04-01'::date) AND (date_created <= '2019-04-30'::date) AND (partner_id = 1))"
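The index orders_partner_id_date_created_index is not used, so the estimated cost is very high!
But starting from some offset value (the exact value varies from time to time and seems to depend on the total number of rows), the index starts being used: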
select order_id, date_created, recipient, phone, state_code, state_date
from orders
where partner_id=1 and date_created between '2019-04-01' and '2019-04-30'
order by order_id asc limit 10 offset 40;
Plan:
QUERY PLAN
"Limit (cost=81449.76..81449.79 rows=10 width=91)"
" -> Sort (cost=81449.66..81502.48 rows=21127 width=91)"
" Sort Key: order_id"
" -> Bitmap Heap Scan on orders (cost=4241.93..80747.84 rows=21127 width=91)"
" Recheck Cond: ((partner_id = 1) AND (date_created >= '2019-04-01'::date) AND (date_created <= '2019-04-30'::date))"
" -> Bitmap Index Scan on orders_partner_id_date_created_index (cost=0.00..4236.65 rows=21127 width=0)"
" Index Cond: ((partner_id = 1) AND (date_created >= '2019-04-01'::date) AND (date_created <= '2019-04-30'::date))"
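What is happening? Is there a way to force the server to use the index?
Answer 0 (score: 3)
General answer:
I can think of two solutions:
A) Give the planner more data by running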
ANALYZE orders;
(https://www.postgresql.org/docs/9.6/sql-analyze.html)
or by changing the statistics that are collected
ALTER TABLE orders ALTER COLUMN ... SET STATISTICS ...;
(https://www.postgresql.org/docs/9.6/planner-stats.html)
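As a rough sketch of option A, the statements below raise the per-column statistics target and then re-collect statistics. The choice of columns (partner_id and date_created) and the target value 1000 are assumptions made for illustration only; the default target is 100, and the right value for this table would have to be found by experiment.

```sql
-- Assumption: partner_id and date_created are the columns whose row
-- estimates matter here; 1000 is an illustrative target, not a recommendation.
ALTER TABLE orders ALTER COLUMN partner_id SET STATISTICS 1000;
ALTER TABLE orders ALTER COLUMN date_created SET STATISTICS 1000;

-- A new target only takes effect once the statistics are collected again.
ANALYZE orders;

-- Inspect what the planner now knows about the relevant columns.
SELECT attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'orders'
  AND attname IN ('partner_id', 'date_created', 'order_id');
```

After this, running EXPLAIN on the first-page query again shows whether the plan has changed.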
B) Rewrite the query in a way that hints at the desired index usage, like this:
WITH
partner_date (partner_id, date_created) AS (
SELECT 1,
generate_series('2019-04-01'::date, '2019-04-30'::date, '1 day'::interval)::date
)
SELECT o.order_id, o.date_created, o.recipient, o.phone, o.state_code, o.state_date
FROM orders o
JOIN partner_date pd
ON (o.partner_id, o.date_created) = (pd.partner_id, pd.date_created)
ORDER BY order_id ASC LIMIT 10 OFFSET 0;
Or perhaps better:
WITH
partner_date (partner_id, date_created) AS (
SELECT 1,
generate_series('2019-04-01'::date, '2019-04-30'::date, '1 day'::interval)::date
),
all_data AS (
SELECT o.order_id, o.date_created, o.recipient, o.phone, o.state_code, o.state_date
FROM orders o
JOIN partner_date pd
ON (o.partner_id, o.date_created) = (pd.partner_id, pd.date_created)
)
SELECT *
FROM all_data
ORDER BY order_id ASC LIMIT 10 OFFSET 0;
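Disclaimer: I cannot explain why the Postgres planner should treat the first query any differently, I just think it might. The second query, on the other hand, separates the offset/limit from the join, and I would be very surprised if Postgres still executed it in the "bad" (according to your benchmark) way.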
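Whether either rewrite actually changes the plan can be checked rather than guessed. A minimal sketch, wrapping the second rewrite in EXPLAIN; note that the ANALYZE option really executes the query, so the timings are measured and not estimated:

```sql
-- Shows the chosen plan with actual row counts, timings and buffer usage.
-- Look for orders_partner_id_date_created_index in the output.
EXPLAIN (ANALYZE, BUFFERS)
WITH
partner_date (partner_id, date_created) AS (
    SELECT 1,
           generate_series('2019-04-01'::date, '2019-04-30'::date, '1 day'::interval)::date
),
all_data AS (
    SELECT o.order_id, o.date_created, o.recipient, o.phone, o.state_code, o.state_date
    FROM orders o
    JOIN partner_date pd
      ON (o.partner_id, o.date_created) = (pd.partner_id, pd.date_created)
)
SELECT *
FROM all_data
ORDER BY order_id ASC LIMIT 10 OFFSET 0;
```

In PostgreSQL 9.6 a WITH query is always evaluated as its own unit (it acts as an optimization fence), which is why the outer ORDER BY ... LIMIT cannot be pushed down into the scan of orders in this form.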