Does selecting from table1 into a temp table before joining to table2 help the query run faster?

Time: 2018-03-28 19:22:12

Tags: postgresql postgresql-9.4

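My table1 has 50,000,000 rows.

I am joining it to table2, which has 100,000,000 rows.

The query runs slowly. That does not surprise me, since both tables are large.

When I look at table1, I find that I only need 39,000 of its rows.

If I select those 39,000 rows into a temp table and then join the temp table to table2, will that help my query run faster?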
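
Concretely, the temp-table approach would look something like the sketch below. The name `tmp_table1` is a placeholder that is not in my schema, and the filters are the table1 conditions from the slow query further down; whether this actually beats the single query is exactly what I am unsure about.

```sql
-- Sketch only: tmp_table1 is a hypothetical name.
-- Step 1: pull just the rows of table1 that pass the table1 filters.
CREATE TEMP TABLE tmp_table1 AS
SELECT field2
FROM table1
WHERE extract(year from cast(field4 as date)) = 2018
  AND extract(month from cast(field4 as date)) = 1
  AND field5 = 'foo';

-- Step 2: collect statistics so the planner knows how small the temp table is.
ANALYZE tmp_table1;

-- Step 3: join the small temp table to table2.
SELECT table2.field1,
       count(*)
FROM tmp_table1
LEFT JOIN table2
       ON tmp_table1.field2 = table2.field3
WHERE table2.field6 = 'bar'
GROUP BY table2.field1;
```

Note that step 1 still has to read table1 with the same filters, so any saving would have to come from the join step.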

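Or does PostgreSQL already optimize my query in some way to the same effect?

The fields I join on are indexed, but some of the fields in my WHERE clause are not.

My table indexes, row counts, and the current slow query are below.

Thanks!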

 schemaname | tablename  |            indexname            | tablespace |                                         indexdef
------------+------------+---------------------------------+------------+-------------------------------------------------------------------------------------------
 schema1    | table1     | idx_table1_ind1                 |            | CREATE INDEX idx_table1_ind1 ON table1 USING btree(field2)
 schema1    | table1     | pk_table1                       |            | CREATE UNIQUE INDEX pk_table1 ON table1 USING btree (id)
(5 rows)

iii=> select * from pg_indexes where tablename = 'table2';
 schemaname | tablename |               indexname               | tablespace |                                                indexdef
------------+-----------+---------------------------------------+------------+--------------------------------------------------------------------------------------------------------
 schema2    | table2    | idx_table2_ind1                       |            | CREATE INDEX idx_table2_ind1 ON table2 USING btree (field3)
 schema2    | table2    | idx_table2_ind3                       |            | CREATE INDEX idx_table2_ind3 ON table2 USING btree (field6)
 schema2    | table2    | pk_table2                             |            | CREATE UNIQUE INDEX pk_table2 ON table2 USING btree (id)
(5 rows)

iii=> select count(id) from table1;
  count
----------
 50,442,468
(1 row)

iii=> select count(id) from table2;
   count
-----------
 107,978,483
(1 row)

select
table2.field1,
count(*)
from table1
left join table2
on table1.field2 = table2.field3
where 1=1
and extract(year from cast(table1.field4 as date)) = 2018
and extract(month from cast(table1.field4 as date)) = 1
and table1.field5 = 'foo'
and table2.field6 = 'bar'
group by table2.field1
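
For comparison, the two date filters could probably be written as one range condition on the bare column, which is the form a btree index can use. This is only a sketch: it assumes `field4` is a date or timestamp column (not text), and the index name `idx_table1_field5_field4` is hypothetical and does not exist in my schema.

```sql
-- Assumption: field4 is a date/timestamp column, so January 2018 is the
-- half-open range [2018-01-01, 2018-02-01).
SELECT table2.field1,
       count(*)
FROM table1
LEFT JOIN table2
       ON table1.field2 = table2.field3
WHERE table1.field4 >= DATE '2018-01-01'
  AND table1.field4 <  DATE '2018-02-01'
  AND table1.field5 = 'foo'
  AND table2.field6 = 'bar'
GROUP BY table2.field1;

-- Hypothetical index that such a range condition could use:
-- CREATE INDEX idx_table1_field5_field4 ON table1 (field5, field4);
```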

Explain results:

  QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2945968.78..2945968.88 rows=10 width=40) (actual time=22526.520..22526.543 rows=10 loops=1)
   ->  HashAggregate  (cost=2945968.77..2945974.93 rows=616 width=40) (actual time=22526.515..22526.531 rows=11 loops=1)
         Group Key: varfield.field_content
         ->  Nested Loop  (cost=0.57..2945965.69 rows=616 width=40) (actual time=15151.373..22343.986 rows=102148 loops=1)
               ->  Seq Scan on circ_trans  (cost=0.00..2870045.85 rows=59 width=8) (actual time=15151.324..17935.922 rows=102147 loops=1)
                     Filter: ((item_agency_code_num = 9) AND (date_part('year'::text, ((transaction_gmt)::date)::timestamp without time zone) = 2018::double precision) AND (date_part('month'::text, ((transaction_gmt)::date)::timestamp without time zone) = 1::double precision))
                     Rows Removed by Filter: 50371955
               ->  Index Scan using idx_varfield_record_id on varfield  (cost=0.57..1286.68 rows=10 width=48) (actual time=0.014..0.040 rows=1 loops=102147)
                     Index Cond: (record_id = circ_trans.bib_record_id)
                     Filter: ((marc_tag)::text = '245'::text)
                     Rows Removed by Filter: 26
 Planning time: 0.528 ms
 Execution time: 22527.076 ms
(13 rows)

0 answers:

No answers