Postgres: fetching rows from an index already sorted by primary key

Date: 2017-12-23 20:45:18

Tags: sql postgresql sorting indexing

I have a table of forum posts and want to retrieve the posts created by a particular user with the following query:

SELECT * FROM forum.posts WHERE authorid=? ORDER BY postid LIMIT ?

Here authorid is indexed and postid is the clustered primary key. This is the full schema:

+--------------+--------------------------+-------------+
| Column       | Type                     | Modifiers   |
|--------------+--------------------------+-------------|
| postid       | integer                  |  not null   |
| postdate     | timestamp with time zone |  not null   |
| postbody     | text                     |  not null   |
| parentthread | integer                  |  not null   |
| parentpage   | integer                  |  not null   |
| authorid     | integer                  |  not null   |
| totalpages   | integer                  |             |
| postsubject  | text                     |             |
| thread       | boolean                  |  not null   |
| subforum     | smallint                 |  not null   |
+--------------+--------------------------+-------------+
Indexes:
    "posts_pkey" PRIMARY KEY, btree (postid) CLUSTER
    "date_index" btree (postdate)
    "forum_index" btree (subforum)
    "page_index" btree (parentpage)
    "parent_index" btree (parentthread)
    "thread_index" btree (thread)
    "user_index" btree (authorid)

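However, for users with a lot of posts the query takes a long time, because it first uses the index to retrieve the matching rows and then has to sort them again. Here is the EXPLAIN ANALYZE for one such user: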

Limit  (cost=22881.46..22881.53 rows=25 width=139) (actual time=1424.436..1424.451 rows=25 loops=1)
  ->  Sort  (cost=22881.46..22897.09 rows=6250 width=139) (actual time=1424.434..1424.442 rows=25 loops=1)
        Sort Key: postid
        Sort Method: top-N heapsort  Memory: 43kB
        ->  Index Scan using user_index on posts  (cost=0.57..22705.09 rows=6250 width=139) (actual time=2.235..1420.733 rows=3022 loops=1)
              Index Cond: (authorid = ?)
Planning time: 0.114 ms
Execution time: 1424.489 ms

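I thought clustering would help, but there are too many posts. For users with even more posts, the planner scans the primary key index and applies a filter instead of using the authorid index and sorting. The planner rates this plan as cheap, but it still takes forever because so many rows have to be read and discarded (76,736,945 rows removed by the filter in the run below):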

Limit  (cost=0.57..149978.39 rows=25 width=139) (actual time=205822.311..210766.374 rows=25 loops=1)
  ->  Index Scan using posts_pkey on posts  (cost=0.57..664137787.62 rows=110706 width=139) (actual time=205822.310..210766.359 rows=25 loops=1)
        Filter: (authorid = ?)
        Rows Removed by Filter: 76736945
Planning time: 0.111 ms
Execution time: 210766.403 ms

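How can I retrieve a user's posts in sorted order? Is there any practical way in SQL to have the index on authorid keep its entries sorted by postid within each authorid? This feature is important for what I'm doing, and at this point a SQL database doesn't seem like the best choice.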

1 answer:

Answer 0 (score: 1)

For this query:

SELECT *
FROM forum.posts
WHERE authorid = ?
ORDER BY postid
LIMIT ?

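I would recommend a secondary index on (authorid, postid). Within each authorid, the index entries are then already ordered by postid, so this should avoid the sort.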
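
As a minimal sketch of that suggestion, the index could be created and checked roughly as follows. The index name posts_authorid_postid_idx and the literal values 42 and 25 are placeholders that do not come from the question; only the column order (authorid first, then postid) matters.

```sql
-- Hypothetical index name; authorid must come first so that the entries
-- for a single author are stored contiguously, ordered by postid.
CREATE INDEX posts_authorid_postid_idx
    ON forum.posts (authorid, postid);

-- Refresh planner statistics after building the index.
ANALYZE forum.posts;

-- Re-check the plan. 42 and 25 are placeholder values for the two
-- query parameters.
EXPLAIN ANALYZE
SELECT *
FROM forum.posts
WHERE authorid = 42
ORDER BY postid
LIMIT 25;
```

If the planner picks the new index, the plan would be expected to show an Index Scan on it with no separate Sort node, stopping after LIMIT rows. Because a btree index on (authorid, postid) can also serve lookups on authorid alone, the existing user_index probably becomes redundant, though dropping it is optional.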