How do I create a map-style index in Postgres?

Time: 2018-07-08 16:45:43

Tags: postgresql

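I want to create a map-style index, like a map in Go or an associative array in JavaScript. The key should be account_id and the value should be an ordered list of records. Is that possible? I found that Postgres has expression indexes, but I don't know how to build such a mapping from an expression that involves an OR condition.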

My real example:

I have a table of account transfers, and I currently use this query to get an account's latest balance:

SELECT
        valtr_id,
        from_id,
        to_id,
        from_balance,
        to_balance
FROM value_transfer v
WHERE
        (v.block_num<=2435013) AND
        (
                (v.to_id = 22479) OR
                (v.from_id = 22479) 
        )
ORDER BY v.block_num DESC,v.valtr_id DESC LIMIT 1

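The OR is required because an account can have outgoing transfers (from_id is set) or incoming transfers (to_id is set). If I had an associative-array index keyed by account_id (derived from the condition from_id = account_id OR to_id = account_id), Postgres could look up account_id in that index and get the list of records, already sorted. Because the index would already account for the OR condition, I would not need to build the record list for from_id = 22479, then the one for to_id = 22479, and then compare which record has the most recent timestamp to get the account's latest balance, as my current query does. (block_num is the blockchain block in which the transfer took place.)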

Currently this query takes a long time on a huge database with 100 million records. Here is its EXPLAIN ANALYZE output:

postgres-> \g
                                                                        QUERY PLAN                                                                         
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1592973.24..1592973.24 rows=1 width=31) (actual time=86448.709..86448.710 rows=1 loops=1)
   ->  Sort  (cost=1592973.24..1595439.02 rows=986312 width=31) (actual time=86448.707..86448.707 rows=1 loops=1)
         Sort Key: block_num DESC, valtr_id DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on value_transfer v  (cost=35340.86..1588041.68 rows=986312 width=31) (actual time=851.598..85082.223 rows=1387411 loops=1)
               Recheck Cond: ((to_id = 22479) OR (from_id = 22479))
               Filter: (block_num <= 2435013)
               Rows Removed by Filter: 298923
               Heap Blocks: exact=274549
               ->  BitmapOr  (cost=35340.86..35340.86 rows=1291543 width=0) (actual time=729.917..729.917 rows=0 loops=1)
                     ->  Bitmap Index Scan on vt_to_id_idx  (cost=0.00..27233.03 rows=1004862 width=0) (actual time=575.558..575.558 rows=1364039 loops=1)
                           Index Cond: (to_id = 22479)
                     ->  Bitmap Index Scan on vt_from_id_idx  (cost=0.00..7614.68 rows=286681 width=0) (actual time=154.356..154.356 rows=352366 loops=1)
                           Index Cond: (from_id = 22479)
 Planning time: 0.367 ms
 Execution time: 86448.817 ms
(16 rows)

postgres=> 

The table is defined as follows:

CREATE TABLE value_transfer (
    valtr_id            BIGSERIAL       PRIMARY KEY,
    tx_id               BIGINT          REFERENCES transaction(tx_id) ON DELETE CASCADE ON UPDATE CASCADE,
    block_id            INT             REFERENCES block(block_id) ON DELETE CASCADE ON UPDATE CASCADE,
    block_num           INT             NOT NULL,
    from_id             INT             NOT NULL,
    to_id               INT             NOT NULL,
    value               NUMERIC         DEFAULT 0,
    from_balance        NUMERIC         DEFAULT 0,
    to_balance          NUMERIC         DEFAULT 0,
    kind                CHAR            NOT NULL,
    depth               INT             DEFAULT 0,
    error               TEXT            NOT NULL
);
CREATE INDEX vt_tx_idx          ON  value_transfer  USING   btree   ("tx_id");
CREATE INDEX vt_block_num_idx   ON  value_transfer      USING   btree   ("block_num");
CREATE INDEX vt_block_id_idx    ON  value_transfer      USING   btree   ("block_id");
CREATE INDEX vt_from_id_idx     ON  value_transfer  USING   btree   ("from_id");
CREATE INDEX vt_to_id_idx       ON  value_transfer  USING   btree   ("to_id");

from_id and to_id are foreign keys into the account table:

CREATE TABLE account (
    account_id          SERIAL          PRIMARY KEY,
    owner_id            INT             NOT NULL DEFAULT 0,
    last_balance        NUMERIC         DEFAULT 0,
    num_tx              BIGINT          DEFAULT 0,
    ts_created          INT             DEFAULT 0,
    block_created       INT             DEFAULT 0,
    deleted             SMALLINT        DEFAULT 0,
    block_sd            INT             DEFAULT 0,
    address             TEXT            NOT NULL UNIQUE
);

Edit:

Comparison of the execution plans for the UNION query proposed by Lukasz and the old query

UNION query:

 Limit  (cost=1668089.09..1668089.10 rows=1 width=32) (actual time=6115.484..6115.485 rows=1 loops=1)
   ->  Sort  (cost=1668089.09..1671668.88 rows=1431916 width=32) (actual time=6115.483..6115.483 rows=1 loops=1)
         Sort Key: v.block_num DESC, v.valtr_id DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Append  (cost=21229.61..1660929.51 rows=1431916 width=32) (actual time=255.166..5446.818 rows=1413507 loops=1)
               ->  Bitmap Heap Scan on value_transfer v  (cost=21229.61..1229731.99 rows=1134056 width=32) (actual time=255.165..4312.769 rows=1102867 loops=1)
                     Recheck Cond: (to_id = 22479)
                     Rows Removed by Index Recheck: 9412580
                     Filter: (block_num <= 2435013)
                     Heap Blocks: exact=32392 lossy=132879
                     ->  Bitmap Index Scan on vt_to_id_idx  (cost=0.00..20946.10 rows=1134071 width=0) (actual time=241.632..241.632 rows=1102867 loops=1)
                           Index Cond: (to_id = 22479)
               ->  Index Scan using vt_from_id_idx on value_transfer v_1  (cost=0.57..416878.36 rows=297860 width=32) (actual time=0.056..952.883 rows=310640 loops=1)
                     Index Cond: (from_id = 22479)
                     Filter: (block_num <= 2435013)
 Planning time: 0.319 ms
 Execution time: 6115.539 ms
(17 rows)

OR condition query (my original query):

 Limit  (cost=1276124.75..1276124.75 rows=1 width=32) (actual time=7860.439..7860.440 rows=1 loops=1)
   ->  Sort  (cost=1276124.75..1279694.24 rows=1427797 width=32) (actual time=7860.437..7860.437 rows=1 loops=1)
         Sort Key: block_num DESC, valtr_id DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on value_transfer v  (cost=27162.56..1268985.76 rows=1427797 width=32) (actual time=304.197..7194.825 rows=1387411 loops=1)
               Recheck Cond: ((to_id = 22479) OR (from_id = 22479))
               Rows Removed by Index Recheck: 13260750
               Filter: (block_num <= 2435013)
               Heap Blocks: exact=37782 lossy=186738
               ->  BitmapOr  (cost=27162.56..27162.56 rows=1431937 width=0) (actual time=288.359..288.359 rows=0 loops=1)
                     ->  Bitmap Index Scan on vt_to_id_idx  (cost=0.00..20946.11 rows=1134072 width=0) (actual time=216.708..216.708 rows=1102867 loops=1)
                           Index Cond: (to_id = 22479)
                     ->  Bitmap Index Scan on vt_from_id_idx  (cost=0.00..5502.55 rows=297865 width=0) (actual time=71.649..71.649 rows=310640 loops=1)
                           Index Cond: (from_id = 22479)
 Planning time: 0.257 ms
 Execution time: 7860.481 ms
(16 rows)

With the UNION query, execution is about 1.7 seconds faster (6115 ms versus 7860 ms).

Edit 2

This simple query is very fast:

EXPLAIN ANALYZE
SELECT
        valtr_id,
        from_id,
        to_id,
        from_balance,
        to_balance,
        block_num
FROM value_transfer v
WHERE v.block_num<=2435013 AND v.from_id = 22479
LIMIT 1

 Limit  (cost=0.57..1.97 rows=1 width=32) (actual time=0.047..0.047 rows=1 loops=1)
   ->  Index Scan using vt_from_id_idx on value_transfer v  (cost=0.57..416878.36 rows=297860 width=32) (actual time=0.045..0.045 rows=1 loops=1)
         Index Cond: (from_id = 22479)
         Filter: (block_num <= 2435013)
 Planning time: 0.392 ms
 Execution time: 0.089 ms
(6 rows)

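But the OR version takes far longer. Something must go wrong when Postgres is told to build a UNION of the two queries. Perhaps writing it in PL/pgSQL would be better.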

2 answers:

Answer 0: (score: 1)

I would try rewriting it as:

--or-expansion
SELECT
        valtr_id,
        from_id,
        to_id,
        from_balance,
        to_balance,
        block_num
FROM value_transfer v
WHERE v.block_num<=2435013 AND v.to_id = 22479
UNION ALL
SELECT
        valtr_id,
        from_id,
        to_id,
        from_balance,
        to_balance,
        block_num
FROM value_transfer v
WHERE v.block_num<=2435013 AND v.from_id = 22479              
ORDER BY block_num DESC,valtr_id DESC LIMIT 1

And add two indexes:

CREATE INDEX idx_1 ON value_transfer(from_id, block_num DESC);
CREATE INDEX idx_2 ON value_transfer(to_id, block_num DESC);
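
The query's ORDER BY also sorts on valtr_id, so a variant of these indexes that carries valtr_id as a third column may let each branch be read from the index in exactly the requested order. The following is only a sketch: the index names are hypothetical, and whether the planner actually uses the indexes should be confirmed with EXPLAIN ANALYZE on the real data.

```sql
-- Hypothetical index names; the columns come from the value_transfer table above.
-- Each index matches "<account column> = ? AND block_num <= ?"
-- followed by ORDER BY block_num DESC, valtr_id DESC.
CREATE INDEX vt_from_block_valtr_idx
    ON value_transfer (from_id, block_num DESC, valtr_id DESC);

CREATE INDEX vt_to_block_valtr_idx
    ON value_transfer (to_id, block_num DESC, valtr_id DESC);
```

With 100 million rows, building these indexes will take time and disk space, so it is worth testing on a copy of the table first.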

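Answer 1: (score: 0)

Try this; it should be much faster, because each branch applies its own ORDER BY and LIMIT 1, so the outer query only has to compare at most two rows.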

(
SELECT valtr_id,
       from_id,
       to_id,
       from_balance,
       to_balance,
       block_num
  FROM value_transfer v
 WHERE v.block_num<=2435013 AND
       v.to_id = 22479
 ORDER BY block_num DESC,valtr_id DESC
 LIMIT 1
)
 UNION ALL
(
SELECT valtr_id,
       from_id,
       to_id,
       from_balance,
       to_balance,
       block_num
  FROM value_transfer v
 WHERE v.block_num<=2435013 AND 
       v.from_id = 22479              
 ORDER BY block_num DESC,valtr_id DESC
 LIMIT 1
 )
 ORDER BY block_num DESC,valtr_id DESC
 LIMIT 1
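
Since the question mentions wrapping the logic in a function, here is a minimal sketch of the same per-branch LIMIT query as a plain SQL function (PL/pgSQL is not needed for this). The function name latest_transfer and its parameter names are hypothetical and do not appear in the original post; the sketch assumes the value_transfer table defined in the question.

```sql
-- Hypothetical helper: returns the most recent transfer touching an account
-- at or before a given block, or no row if there is none.
CREATE OR REPLACE FUNCTION latest_transfer(p_account_id INT, p_block_num INT)
RETURNS SETOF value_transfer
LANGUAGE sql
STABLE
AS $$
    SELECT *
    FROM (
        -- Latest incoming transfer for the account.
        (SELECT *
           FROM value_transfer
          WHERE to_id = p_account_id
            AND block_num <= p_block_num
          ORDER BY block_num DESC, valtr_id DESC
          LIMIT 1)
        UNION ALL
        -- Latest outgoing transfer for the account.
        (SELECT *
           FROM value_transfer
          WHERE from_id = p_account_id
            AND block_num <= p_block_num
          ORDER BY block_num DESC, valtr_id DESC
          LIMIT 1)
    ) AS candidates
    -- Pick the newer of the (at most two) candidate rows.
    ORDER BY block_num DESC, valtr_id DESC
    LIMIT 1;
$$;

-- Example call with the values used in the question.
SELECT valtr_id, from_id, to_id, from_balance, to_balance
FROM latest_transfer(22479, 2435013);
```

The function only packages the query; how fast it runs still depends on indexes that let each branch find its latest row without sorting.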