我有一个分区表,并在其上创建了唯一索引。
我正在尝试运行一些查询,其中一些使用主键约束,而另一些使用我创建的索引。我希望查询使用唯一索引而不是主要约束。
我尝试重新编制索引,但是没有用。
这是两个查询
1)在这里,我创建的索引正在被使用。
查询计划是:
Finalize Aggregate (cost=296958.94..296958.95 rows=1 width=8) (actual time=927.948..927.948 rows=1 loops=1)
-> Gather (cost=296958.72..296958.93 rows=2 width=8) (actual time=927.887..933.730 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=295958.72..295958.73 rows=1 width=8) actual time=924.885..924.885 rows=1 loops=3)
-> Parallel Append (cost=0.68..293370.57 rows=1035261 width=8) (actual time=0.076..852.758 rows=825334 loops=3)
-> Parallel Index Only Scan using testdate2019jan_april_cost_mo_user_id_account_id_resource__idx5 on testdate2019jan_april_cost_mod3rem2 (cost=0.68..146591.56 rows=525490 width=8) (actual time=0.082..388.130 rows=421251 loops=3)
Index Cond: (user_id = 1)
Heap Fetches: 3922
-> Parallel Index Only Scan using testdate2018sept_dec_cost_mod_user_id_account_id_resource__idx5 on testdate2018sept_dec_cost_mod3rem2 (cost=0.68..141570.15 rows=509767 width=8) (actual time=0.057..551.572 rows=606125 loops=2)
Index Cond: (user_id = 1)
Heap Fetches: 0
-> Parallel Index Scan using testdate2018jan_april_cost_mo_account_id_user_id_resource__idx2 on testdate2018jan_april_cost_mod3rem2 (cost=0.12..8.14 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (user_id = 1)
-> Parallel Index Scan using testdate2018may_august_cost_m_account_id_user_id_resource__idx1 on testdate2018may_august_cost_mod3rem2 (cost=0.12..8.14 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (user_id = 1)
-> Parallel Index Scan using testdate2019may_august_cost_m_account_id_user_id_resource__idx2 on testdate2019may_august_cost_mod3rem2 (cost=0.12..8.14 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=1)
Index Cond: (user_id = 1)
-> Parallel Index Scan using testdate2019sept_dec_cost_mod_account_id_user_id_resource__idx2 on testdate2019sept_dec_cost_mod3rem2 (cost=0.12..8.14 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)
Index Cond: (user_id = 1)
Planning Time: 0.754 ms
Execution Time: 933.797 ms
在上面的查询中,我的索引testdate2018may_august_cost_m_account_id_user_id_resource__idx1
的使用如我所愿。
2)这里没有使用我创建的索引,而是使用了主要约束。
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=2 read=66080
-> Finalize GroupAggregate (cost=388046.40..388187.55 rows=6 width=61) (actual time=510.710..513.262 rows=10 loops=1)
Group Key: c_1.instance_type, c_1.currency
Buffers: shared hit=2 read=66080
-> Gather Merge (cost=388046.40..388187.24 rows=12 width=85) (actual time=510.689..513.303 rows=28 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=26 read=206407
-> Partial GroupAggregate (cost=387046.38..387185.83 rows=6 width=85) (actual time=504.731..507.277 rows=9 loops=3)
Group Key: c_1.instance_type, c_1.currency
Buffers: shared hit=26 read=206407
-> Sort (cost=387046.38..387056.71 rows=4130 width=36) (actual time=504.694..504.933 rows=3895 loops=3)
Sort Key: c_1.instance_type, c_1.currency
Sort Method: quicksort Memory: 404kB
Worker 0: Sort Method: quicksort Memory: 354kB
Worker 1: Sort Method: quicksort Memory: 541kB
Buffers: shared hit=20 read=206407
-> Parallel Append (cost=0.13..386798.33 rows=4130 width=36) (actual time=0.081..501.720 rows=3895 loops=3)
Buffers: shared hit=6 read=206405
Subplans Removed: 3
-> Parallel Index Scan using testdate2019may_august_cost_mod3rem2_pkey on testdate2019may_august_cost_mod3rem2 c_1 (cost=0.13..8.15 rows=1 width=36) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: ((usage_start_date >= (CURRENT_DATE - 30)) AND (user_id = '1'::bigint))
Filter: ((instance_type IS NOT NULL) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE))
Buffers: shared hit=1
-> Parallel Index Scan using testdate2019sept_dec_cost_mod3rem2_pkey on testdate2019sept_dec_cost_mod3rem2 c_2 (cost=0.13..8.15 rows=1 width=36) (actual time=0.006..0.006 rows=0 loops=1)
Index Cond: ((usage_start_date >= (CURRENT_DATE - 30)) AND (user_id = '1'::bigint))
Filter: ((instance_type IS NOT NULL) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE))
Buffers: shared hit=1
-> Parallel Seq Scan on testdate2019jan_april_cost_mod3rem2 c (cost=0.00..258266.58 rows=4125 width=36) (actual time=0.076..501.060 rows=3895 loops=3)
Filter: ((instance_type IS NOT NULL) AND (user_id = '1'::bigint) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE) AND (usage_start_date >= (CURRENT_DATE - 30)))
Rows Removed by Filter: 1504689
Buffers: shared hit=4 read=206405
Planning Time: 1.290 ms
Execution Time: 513.439 ms
在上面的查询testdate2019sept_dec_cost_mod3rem2_pkey
中,它是主要的约束条件。
我希望它使用我创建的索引而不是主要约束。 根据分区,我的第二个查询计划是否正确?
表创建查询:
CREATE TABLE a2i.testawscost_line_item (
line_item_id uuid NOT NULL,
account_id character varying(255) COLLATE pg_catalog."default",
availability_zone character varying(255) COLLATE pg_catalog."default",
base_cost double precision,
base_rate double precision,
cost double precision,
currency character varying(255) COLLATE pg_catalog."default",
instance_family character varying(255) COLLATE pg_catalog."default",
instance_type character varying(255) COLLATE pg_catalog."default",
line_item_type character varying(255) COLLATE pg_catalog."default",
operating_system character varying(255) COLLATE pg_catalog."default",
operation character varying(255) COLLATE pg_catalog."default",
payer_account_id character varying(255) COLLATE pg_catalog."default",
product_code character varying(255) COLLATE pg_catalog."default",
product_family character varying(255) COLLATE pg_catalog."default",
product_group character varying(255) COLLATE pg_catalog."default",
product_name character varying(255) COLLATE pg_catalog."default",
rate double precision,
rate_description character varying(255) COLLATE pg_catalog."default",
reservation_id character varying(255) COLLATE pg_catalog."default",
resource_type character varying(255) COLLATE pg_catalog."default",
sku character varying(255) COLLATE pg_catalog."default",
tax_type character varying(255) COLLATE pg_catalog."default",
unit character varying(255) COLLATE pg_catalog."default",
usage_end_date timestamp without time zone,
usage_quantity double precision,
usage_start_date timestamp without time zone,
usage_type character varying(255) COLLATE pg_catalog."default",
user_id bigint,
resource_id character varying(255) COLLATE pg_catalog."default",
CONSTRAINT testawscost_line_item_pkey PRIMARY KEY
(line_item_id, usage_start_date, user_id),
CONSTRAINT fkptp4hyur3i4yj88wo3rxnaf05 FOREIGN KEY (resource_id)
REFERENCES a2i.awscost_resource (resource_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
) PARTITION BY hash(user_id);
分区:
create table a2i.testuser_cost_mod3rem0
partition of a2i.testawscost_line_item
for values with (MODULUS 3, REMAINDER 0)
partition by range(usage_start_date);
create table a2i.testuser_cost_mod3rem1
partition of a2i.testawscost_line_item
for values with (MODULUS 3, REMAINDER 1)
partition by range(usage_start_date);
create table a2i.testuser_cost_mod3rem2
partition of a2i.testawscost_line_item
for values with (MODULUS 3, REMAINDER 2)
partition by range(usage_start_date);
2019年分区的分区:
create table a2i.testdate2019jan_april_cost_mod3rem0
partition of a2i.testuser_cost_mod3rem0
for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');
create table a2i.testdate2019may_august_cost_mod3rem0
partition of a2i.testuser_cost_mod3rem0
for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');
create table a2i.testdate2019sept_dec_cost_mod3rem0
partition of a2i.testuser_cost_mod3rem0
for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');
create table a2i.testdate2019jan_april_cost_mod3rem1
partition of a2i.testuser_cost_mod3rem1
for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');
create table a2i.testdate2019may_august_cost_mod3rem1
partition of a2i.testuser_cost_mod3rem1
for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');
create table a2i.testdate2019sept_dec_cost_mod3rem1
partition of a2i.testuser_cost_mod3rem1
for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');
create table a2i.testdate2019jan_april_cost_mod3rem2
partition of a2i.testuser_cost_mod3rem2
for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');
create table a2i.testdate2019may_august_cost_mod3rem2
partition of a2i.testuser_cost_mod3rem2
for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');
create table a2i.testdate2019sept_dec_cost_mod3rem2
partition of a2i.testuser_cost_mod3rem2
for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');
索引:
CREATE UNIQUE INDEX awscost_line_item_unique_pkey ON a2i.awscost_line_item (
account_id, user_id, resource_id, usage_start_date, usage_end_date, usage_type,
usage_quantity, line_item_type, sku, rate, base_rate, base_cost,
"cost", currency, product_code, operation
);
对于第一个查询计划,查询为:
explain analyze select sum(cost) from testawscost_line_item where
user_id='1';
第二个查询:
explain (analyze/*, buffers*/) SELECT sum (c.cost),
sum (case when c.resource_type = 'Compute' then c.cost end) as computeCost,
sum (case when c.resource_type = 'Storage' then c.cost end) as storageCost,
sum (case when c.resource_type = 'Network' then c.cost end) as networkCost,
sum (case when c.resource_type not in ('Compute', 'Network', 'Storage')
then c.cost end) as otherCost,
c.currency,
c.instance_type as productFamily,
avg (c.rate) FROM testawscost_line_item c WHERE
(c.user_id ='1') AND (c.account_id = '807331824280') AND
(c.usage_start_date >= current_date-30 AND c.usage_end_date <=
current_date) AND
(c.instance_type is not null )
GROUP BY c.instance_type, c.currency
ORDER BY 1 desc
答案 0 :(得分:1)
问题在于您的索引对于第一个查询有效,但不适用于第二个查询。
resource_id
在索引中,但不在查询中,因此之后的所有索引列均不可用于查询。 PostgreSQL决定使用小得多的主键索引。
此查询的理想索引是:
CREATE INDEX ON a2i.testawscost_line_item (user_id, account_id, usage_start_date)
WHERE instance_type IS NOT NULL;
我假设usage_end_date
上的条件比usage_start_date
上的条件具有更高的选择性。