Question

I have a large table (100M records) with the following structure.

  length   |          created_at
-----------+-------------------------------
 506225551 | 2018-12-29 02:08:34.116618
 133712971 | 2018-10-19 21:20:14.568936
 608443439 | 2018-12-14 03:22:55.141416
 927160571 | 2019-01-30 00:51:41.639126
 407033524 | 2018-11-16 21:26:41.523047
 506008096 | 2018-11-17 00:07:42.839919
 457719749 | 2018-11-12 02:32:53.116225

0 < length < 1000000000
'2017-01-01' < created_at < '2019-02-01'
The data is uniformly distributed in both length and created_at.

I want to run a query like this:

SELECT * FROM tbl WHERE length BETWEEN 2000000 and 3000000 ORDER BY  created_at DESC

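There are about 100K rows between 2000000 and 3000000, so I want to use an index for both the selection and the ordering.

I have tried these approaches:

1. Simple BTREE index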

create index on tbl(length);

This works well for short ranges of length, but I can't use this index to order the records.

2. Multi-column BTREE index

 create index on tbl(length, created_at);

I can use this index only for queries like this:

 SELECT * FROM tbl WHERE length = 2000000 ORDER BY  created_at DESC

3. GIST index with the btree_gist extension. I expected this index to work.

create index on tbl using gist(length, created_at);

But it doesn't. I can't use this index for sorting even with a simple query like this:

test=# explain analyze select * from gist_test where a = 345 order by c desc;

                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=25706.37..25730.36 rows=9597 width=12) (actual time=4.839..5.568 rows=10000 loops=1)
   Sort Key: c DESC
   Sort Method: quicksort  Memory: 853kB
   ->  Bitmap Heap Scan on gist_test  (cost=370.79..25071.60 rows=9597 width=12) (actual time=1.402..2.869 rows=10000 loops=1)
         Recheck Cond: (a = 345)
         Heap Blocks: exact=152
         ->  Bitmap Index Scan on gist_test_a_b_c_idx  (cost=0.00..368.39 rows=9597 width=0) (actual time=1.384..1.384 rows=10000 loops=1)
               Index Cond: (a = 345)
 Planning time: 0.119 ms
 Execution time: 6.271 ms

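I can only use this index the way I would use a simple BTREE on one column.

So, how can I solve this problem?

Maybe there are NoSQL databases that can handle this kind of query?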

Answer 1

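I don't think this is possible (at least in plain PostgreSQL; I don't know of any extension that could help with this). The step of sorting the records can be skipped only when the index already returns them in sorted order.
However:

As stated in the docs, only B-tree indexes can be used for sorting (which makes sense, since a B-tree is implemented as a search tree).
Your where and your order by are not compatible for a B-tree index:
- Since you have both clauses, you need to put 2 columns in an index (A, B)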
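- The data in the index is sorted by (A, B), so it is also sorted by A (which is why PostgreSQL can do a fast index scan on the table when the where is on A only). As a consequence, it is not sorted by B in the index (B is sorted only within the subset where A is constant, not across the whole table).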
- As you probably already know, an index on A alone will not help either, because of B.
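
To make the points above concrete, here is a minimal sketch in Python rather than PostgreSQL. It assumes that a composite B-tree index on (A, B) can be modelled as a list of (A, B) tuples kept in lexicographic order. The data, the helper names index_range and sorted_by_b, and the value ranges are all made up for illustration; this is not how PostgreSQL stores or scans an index.

```python
from bisect import bisect_left, bisect_right

# Hypothetical (A, B) rows: 10 distinct A values, 20 B values each.
rows = [(a, (a * 7 + i * 13) % 100) for a in range(10) for i in range(20)]

# Model of a composite index on (A, B): entries in lexicographic order.
index = sorted(rows)

def index_range(index, lo, hi):
    """Return entries with lo <= A <= hi in index order, like a range scan."""
    start = bisect_left(index, (lo,))
    end = bisect_right(index, (hi, float("inf")))
    return index[start:end]

def sorted_by_b(entries):
    """Check whether the entries already come out ordered by B."""
    bs = [b for _, b in entries]
    return bs == sorted(bs)

single = index_range(index, 3, 3)   # like: WHERE A = 3
multi = index_range(index, 3, 5)    # like: WHERE A BETWEEN 3 AND 5

print(sorted_by_b(single))  # B order holds when A is a single value
print(sorted_by_b(multi))   # B order is lost once several A values are mixed
```

With a single A value the scan output is already ordered by B, so no extra sort is needed. Over a range of A values the B values restart for every A, which is why a separate sort step is still required.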

The example #2 you provided shows that PostgreSQL is optimized for the case where you filter on a single value of A.

If sorting after the where on these 2 columns is not acceptable, then I'm afraid you should not expect anything better than this.
