Postgres query takes a long time

Time: 2020-04-18 09:13:35

Tags: postgresql

I have this SQL query that takes a long time (~12 s) to execute:

select nextval('rec_id_seq_af') as rec_id,ds_id,ds_dt_id,
inv_vcf_id,inv_sample_id,inv_variant_id,ds_dt_line_seq from rs_data_1_af 
group by ds_id,ds_dt_id,inv_vcf_id,inv_sample_id,inv_variant_id,ds_dt_line_seq 
order by ds_id,ds_dt_id,inv_vcf_id,inv_sample_id,inv_variant_id,ds_dt_line_seq;

When I run:

EXPLAIN (ANALYZE, BUFFERS)
select nextval('rec_id_seq_af') as rec_id,ds_id,ds_dt_id,
inv_vcf_id,inv_sample_id,inv_variant_id,ds_dt_line_seq from rs_data_1_af 
group by ds_id,ds_dt_id,inv_vcf_id,inv_sample_id,inv_variant_id,ds_dt_line_seq 
order by ds_id,ds_dt_id,inv_vcf_id,inv_sample_id,inv_variant_id,ds_dt_line_seq;

This is the output:

Group  (cost=724728.48..780477.07 rows=314077 width=88) (actual time=10395.641..12546.322 rows=5703 loops=1)
  Group Key: ds_id, ds_dt_id, inv_vcf_id, inv_sample_id, inv_variant_id, ds_dt_line_seq
  Buffers: shared hit=80975, temp read=91041 written=91171
  ->  Sort  (cost=724728.48..732580.39 rows=3140766 width=80) (actual time=10395.619..12019.351 rows=3140766 loops=1)
        Sort Key: ds_id, ds_dt_id, inv_vcf_id, inv_sample_id, inv_variant_id, ds_dt_line_seq
        Sort Method: external merge  Disk: 286312kB
        Buffers: shared hit=75272, temp read=91041 written=91171
        ->  Seq Scan on rs_data_1_af  (cost=0.00..106679.66 rows=3140766 width=80) (actual time=0.009..575.729 rows=3140766 loops=1)
              Buffers: shared hit=75272
Planning Time: 0.478 ms
Execution Time: 12581.964 ms

As the number of records in the rs_data_1_af table grows, I will have many millions of rows, and this query will take hours to execute.

How can I optimize it?

1 answer:

Answer 0 (score: 0)

Why are you running nextval('rec_id_seq_af')?

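nextval() advances the sequence and returns the next value. It has to do this for every row it is evaluated on, and each call is a separate read/write on the sequence, so it is no wonder that it is slow.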
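One way to check how much of the ~12 s nextval() actually accounts for is to time it separately from the grouping. The plan in the question reports that the Sort node spills to disk (external merge, Disk: 286312kB) and finishes at about 12019 ms, so it may be worth measuring both parts. The following is a rough sketch, not a definitive diagnosis: the row count 5703 is taken from the plan above, and the work_mem value is an arbitrary assumption that would need to suit the server's available memory.

```sql
-- 1. Time nextval() alone for roughly as many rows as the query returns
--    (5703 in the plan above). Note: this really consumes values from
--    rec_id_seq_af, so run it only where gaps in the sequence are acceptable.
EXPLAIN (ANALYZE)
SELECT nextval('rec_id_seq_af')
FROM generate_series(1, 5703);

-- 2. Time the grouping and sorting without nextval().
EXPLAIN (ANALYZE, BUFFERS)
SELECT ds_id, ds_dt_id, inv_vcf_id, inv_sample_id, inv_variant_id, ds_dt_line_seq
FROM rs_data_1_af
GROUP BY ds_id, ds_dt_id, inv_vcf_id, inv_sample_id, inv_variant_id, ds_dt_line_seq
ORDER BY ds_id, ds_dt_id, inv_vcf_id, inv_sample_id, inv_variant_id, ds_dt_line_seq;

-- 3. Optional, session-local experiment: give the sort more memory and
--    re-run statement 2 to see whether "Sort Method: external merge" changes.
--    '512MB' is an assumed value, not a recommendation.
SET work_mem = '512MB';
-- (re-run statement 2 here)
RESET work_mem;
```

If statement 2 on its own still takes most of the time, the sort is the part to look at rather than the sequence.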

What is the goal here? Maybe you could use generate_series() or something similar instead?
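If rec_id only needs to number the grouped rows within a single result set, a window function is one possible replacement for the sequence. This is a minimal sketch under that assumption: unlike nextval(), row_number() restarts at 1 on every run, so it does not fit if the ids must stay unique across runs or be shared with other tables.

```sql
-- Number each distinct group in sort order without touching a sequence.
-- Assumption: rec_id is only needed as a per-result running number.
SELECT row_number() OVER (
           ORDER BY ds_id, ds_dt_id, inv_vcf_id,
                    inv_sample_id, inv_variant_id, ds_dt_line_seq
       ) AS rec_id,
       ds_id, ds_dt_id, inv_vcf_id,
       inv_sample_id, inv_variant_id, ds_dt_line_seq
FROM rs_data_1_af
GROUP BY ds_id, ds_dt_id, inv_vcf_id,
         inv_sample_id, inv_variant_id, ds_dt_line_seq
ORDER BY ds_id, ds_dt_id, inv_vcf_id,
         inv_sample_id, inv_variant_id, ds_dt_line_seq;
```

The window function is evaluated after GROUP BY, so it numbers the grouped rows, not the underlying table rows. The query still has to group and sort the full table, so this variant removes the sequence calls but not the sort.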