Question

I have the following schema:

create table bitmex
(
  timestamp timestamp with time zone not null,
  symbol    varchar(255)             not null,
  side      varchar(255)             not null,
  tid       varchar(255)             not null,
  size      numeric                  not null,
  price     numeric                  not null,
  constraint bitmex_tid_symbol_pk
  primary key (tid, symbol)
);

create index bitmex_timestamp_symbol_index  on bitmex (timestamp, symbol);
create index bitmex_symbol_index  on bitmex (symbol);

I need the exact row count every time, so reltuples is not usable.

The table has more than 45,000,000 rows.

Running

explain analyze select count(*) from bitmex where symbol = 'XBTUSD';

gives

Finalize Aggregate  (cost=1038428.56..1038428.57 rows=1 width=8)
  ->  Gather  (cost=1038428.35..1038428.56 rows=2 width=8)
        Workers Planned: 2
        ->  Partial Aggregate  (cost=1037428.35..1037428.36 rows=1 width=8)
              ->  Parallel Seq Scan on bitmex  (cost=0.00..996439.12 rows=16395690 width=0)
                    Filter: ((symbol)::text = 'XBTUSD'::text)

Running

explain analyze select count(*) from bitmex;

gives

Finalize Aggregate  (cost=997439.34..997439.35 rows=1 width=8) (actual time=6105.463..6105.463 rows=1 loops=1)
  ->  Gather  (cost=997439.12..997439.33 rows=2 width=8) (actual time=6105.444..6105.457 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial Aggregate  (cost=996439.12..996439.14 rows=1 width=8) (actual time=6085.960..6085.960 rows=1 loops=3)
              ->  Parallel Seq Scan on bitmex  (cost=0.00..954473.50 rows=16786250 width=0) (actual time=0.364..4342.460 rows=13819096 loops=3)
Planning time: 0.080 ms
Execution time: 6108.277 ms

Why isn't it using the index? Thanks.

Answer 1

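If every row has to be visited, an index scan is only cheaper when the table itself does not have to be consulted for most of the entries found in the index.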

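Because of the way PostgreSQL is organized, the table has to be visited to determine whether an entry found in the index is visible to the current transaction. That step can be skipped if the whole page is marked as "all visible" in the table's visibility map.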
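
To get a rough idea of how much of the table is currently marked all-visible, you could look at the planner statistics in pg_class. This is only a sketch: relpages and relallvisible are estimates that are refreshed by VACUUM and ANALYZE, so they may be stale, and the pct_all_visible alias is just a name chosen here.

```sql
-- Estimated share of heap pages marked all-visible in the visibility map.
-- A low percentage means an index-only scan would still have to visit
-- the heap for most entries, so the planner will prefer a seq scan.
SELECT relname,
       relpages,
       relallvisible,
       round(100.0 * relallvisible / nullif(relpages, 0), 1) AS pct_all_visible
FROM pg_class
WHERE oid = 'bitmex'::regclass;
```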

To update the visibility map, run VACUUM on the table. After that, an index-only scan might be chosen.
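
A minimal sketch of that, using the table and query from the question. Whether the plan actually changes depends on your data and settings, so treat it as something to try rather than a guaranteed fix.

```sql
-- Refresh the visibility map (and the planner statistics while we are at it).
VACUUM (VERBOSE, ANALYZE) bitmex;

-- Re-check the plan. BUFFERS shows how many pages were actually touched.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM bitmex WHERE symbol = 'XBTUSD';
```

If the planner switches, the plan should contain an "Index Only Scan" node, and its "Heap Fetches" line tells you how many entries still required a visit to the table.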

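But counting the rows of a table is never cheap, even with an index scan. If you need to do it often, it may be a good idea to keep a separate table that contains nothing but a counter for the number of rows. You can then write triggers that update the counter whenever rows are inserted or deleted.

That slows down INSERT and DELETE, but you can get the row count at lightning speed.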
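
Here is a rough sketch of the counter approach. Since the question filters by symbol, it keeps one counter per symbol rather than a single total; that is my assumption about what is needed. The names bitmex_count, n and bitmex_count_trig are made up for illustration.

```sql
BEGIN;

-- Block concurrent writes while the counters are seeded and the trigger
-- is created, so no insert or delete slips through uncounted.
LOCK TABLE bitmex IN SHARE ROW EXCLUSIVE MODE;

-- Hypothetical counter table: one row per symbol.
CREATE TABLE bitmex_count (
    symbol varchar(255) PRIMARY KEY,
    n      bigint NOT NULL
);

-- Seed the counters from the current table contents (one slow scan, once).
INSERT INTO bitmex_count (symbol, n)
SELECT symbol, count(*) FROM bitmex GROUP BY symbol;

CREATE FUNCTION bitmex_count_trig() RETURNS trigger
LANGUAGE plpgsql AS
$$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- Create the counter row for a new symbol, or bump the existing one.
        INSERT INTO bitmex_count (symbol, n) VALUES (NEW.symbol, 1)
        ON CONFLICT (symbol) DO UPDATE SET n = bitmex_count.n + 1;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE bitmex_count SET n = n - 1 WHERE symbol = OLD.symbol;
    END IF;
    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$;

CREATE TRIGGER bitmex_count_trig
    AFTER INSERT OR DELETE ON bitmex
    FOR EACH ROW EXECUTE PROCEDURE bitmex_count_trig();

COMMIT;

-- Exact count for one symbol, and for the whole table:
SELECT n FROM bitmex_count WHERE symbol = 'XBTUSD';
SELECT sum(n) FROM bitmex_count;
```

Things this sketch does not handle: an UPDATE that changes symbol, and TRUNCATE, both of which would need their own triggers. Also note that all concurrent inserts for the same symbol will queue up on the same counter row, which is part of the INSERT/DELETE slowdown mentioned above.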
