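I have a query that uses the table's index when the input parameter has one particular length, but not when the parameter has any other length.
This query correctly uses the table index: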
EXPLAIN (ANALYZE, VERBOSE) SELECT *
FROM equipment_tests et
INNER JOIN equipments e ON e.id = et.equipment_id
WHERE e.organization_id = '6c93a9b5-cde7-4660-a55a-1ba74b97fc58'
LIMIT 100;
Its query plan:
Limit  (cost=0.84..40.75 rows=100 width=294) (actual time=357.878..366.848 rows=100 loops=1)
  Output: et.id, et.equipment_id, et.test_id, et.equipment_config_id, et.context, et.created_at, e.id, e.organization_id, e.type, e.model, e.serial_number, e.version, e.calibration_date, e.created_at, e.updated_at, e.hw_version
  ->  Merge Join  (cost=0.84..232724.95 rows=583224 width=294) (actual time=357.874..366.647 rows=100 loops=1)
        Output: et.id, et.equipment_id, et.test_id, et.equipment_config_id, et.context, et.created_at, e.id, e.organization_id, e.type, e.model, e.serial_number, e.version, e.calibration_date, e.created_at, e.updated_at, e.hw_version
        Merge Cond: ((e.id)::text = (et.equipment_id)::text)
        ->  Index Scan using equip_id on public.equipments e  (cost=0.42..14251.87 rows=33030 width=134) (actual time=0.045..0.045 rows=1 loops=1)
              Output: e.id, e.organization_id, e.type, e.model, e.serial_number, e.version, e.calibration_date, e.created_at, e.updated_at, e.hw_version
              Filter: ((e.organization_id)::text = '6c93a9b5-cde7-4660-a55a-1ba74b97fc58'::text)
              Rows Removed by Filter: 5
        ->  Index Scan using equip_tests_equip_id on public.equipment_tests et  (cost=0.43..208750.41 rows=1525051 width=160) (actual time=0.005..173.042 rows=73224 loops=1)
              Output: et.id, et.equipment_id, et.test_id, et.equipment_config_id, et.context, et.created_at
Total runtime: 366.989 ms
This query does not use the equipment_tests.equipment_id index:
EXPLAIN (ANALYZE, VERBOSE) SELECT *
FROM equipment_tests et
INNER JOIN equipments e ON e.id = et.equipment_id
WHERE e.organization_id = '6c93a9b5-cde7-4660-a55a-1ba74b97fc5'
LIMIT 100;
Its query plan:
Limit  (cost=50.06..14630.82 rows=100 width=294) (actual time=0.043..0.043 rows=0 loops=1)
  Output: et.id, et.equipment_id, et.test_id, et.equipment_config_id, et.context, et.created_at, e.id, e.organization_id, e.type, e.model, e.serial_number, e.version, e.calibration_date, e.created_at, e.updated_at, e.hw_version
  ->  Hash Join  (cost=50.06..56623.39 rows=388 width=294) (actual time=0.040..0.040 rows=0 loops=1)
        Output: et.id, et.equipment_id, et.test_id, et.equipment_config_id, et.context, et.created_at, e.id, e.organization_id, e.type, e.model, e.serial_number, e.version, e.calibration_date, e.created_at, e.updated_at, e.hw_version
        Hash Cond: ((et.equipment_id)::text = (e.id)::text)
        ->  Seq Scan on public.equipment_tests et  (cost=0.00..50850.51 rows=1525051 width=160) (actual time=0.004..0.004 rows=1 loops=1)
              Output: et.id, et.equipment_id, et.test_id, et.equipment_config_id, et.context, et.created_at
        ->  Hash  (cost=49.79..49.79 rows=22 width=134) (actual time=0.027..0.027 rows=0 loops=1)
              Output: e.id, e.organization_id, e.type, e.model, e.serial_number, e.version, e.calibration_date, e.created_at, e.updated_at, e.hw_version
              Buckets: 1024  Batches: 1  Memory Usage: 0kB
              ->  Index Scan using equip_organization on public.equipments e  (cost=0.42..49.79 rows=22 width=134) (actual time=0.025..0.025 rows=0 loops=1)
                    Output: e.id, e.organization_id, e.type, e.model, e.serial_number, e.version, e.calibration_date, e.created_at, e.updated_at, e.hw_version
                    Index Cond: ((e.organization_id)::text = '6c93a9b5-cde7-4660-a55a-1ba74b97fc5'::text)
Total runtime: 0.088 ms
Note that all I did was shorten the organization_id parameter by one character.
Our schema:
Table "equipment_tests"
 Column              | Type                        | Modifiers
---------------------+-----------------------------+-----------
 id                  | character varying           | not null
 equipment_id        | character varying           | not null
 test_id             | character varying           | not null
 equipment_config_id | character varying           | not null
 created_at          | timestamp without time zone | not null
Indexes:
    "equipment_tests_pkey" PRIMARY KEY, btree (id)
    "equipment_tests_test_config_context" UNIQUE, btree (test_id, equipment_config_id)
    "equip_tests_equip_id" btree (equipment_id)
Table "equipments"
 Column           | Type                        | Modifiers
------------------+-----------------------------+-----------
 id               | character varying           | not null
 organization_id  | character varying           | not null
 type             | integer                     |
 model            | character varying           |
 serial_number    | character varying           |
 created_at       | timestamp without time zone | not null
 updated_at       | timestamp without time zone | not null
Indexes:
    "equipments_pkey" PRIMARY KEY, btree (id)
    "equip_organization" btree (organization_id)
    "equipment_org_model_sn" btree (organization_id, model, serial_number)
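Normally the PKs are UUIDs, but there is some legacy data where the ID is a shorter random string (about 22 random alphabetic characters). When we query with these IDs (shorter than a UUID), we find that PG does not use the equipment_id index and does a table scan instead.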
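To make the mix of ID lengths concrete, a query along these lines could show how many rows carry each length. This is only a sketch: it assumes organization_id is the column of interest, the aliases id_length and row_count are illustrative, and no output from it is included here.

```sql
-- Sketch: count rows per ID length in the filtered column.
-- Assumption: organization_id is the relevant column; the same
-- query could be pointed at equipments.id instead.
SELECT length(organization_id) AS id_length,
       count(*)                AS row_count
FROM equipments
GROUP BY length(organization_id)
ORDER BY row_count DESC;
```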
The equipment_tests table has about 40M rows. The equipments table has about 1M rows.
We are using Postgres 9.3.6.
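We think this may be related to the fact that most of the values in this column have one length and a minority are shorter, but I'm not sure what the next debugging step should be.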
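One check that might help test this guess is to look at the statistics the planner holds for the filtered column, since the two plans estimate very different row counts for the organization_id condition (rows=33030 versus rows=22). The sketch below assumes the tables live in the public schema, as the plans suggest, and that organization_id is the column whose statistics matter. It only reads the standard pg_stats view.

```sql
-- Sketch: inspect planner statistics for the filtered column.
-- Assumption: the row estimates in the two plans come from these
-- statistics, so a value listed in most_common_vals would be
-- estimated differently from one that is not.
SELECT n_distinct,
       most_common_vals,
       most_common_freqs
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'equipments'
  AND attname    = 'organization_id';
```

If the full-length value shows up in most_common_vals and the shortened one does not, that would suggest the plan change follows the estimated row count, not the string length as such.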