Question

I have this query, and it is very slow:

SELECT a.id,
COALESCE(uas.read, CAST(0 AS BOOLEAN)) as read,
COALESCE(at.link, '') as thumbnail_link
FROM users_feeds uf INNER JOIN articles a
    ON uf.feed_id = a.feed_id
LEFT OUTER JOIN users_articles_states uas
    ON a.id = uas.article_id AND uf.user_login = uas.user_login
LEFT OUTER JOIN articles_thumbnails at
    ON a.id = at.article_id
WHERE uf.user_login = 'test1'
ORDER BY uas.read, a.date DESC LIMIT 50 OFFSET 0;

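With my current data set it takes about 500 ms on average. The two largest tables are 'articles' and 'users_articles_states', each holding roughly 100,000 records.

If I drop 'uas.read' from the ORDER BY, the query takes about 2 ms. The 'read' and 'date' columns are both indexed in their respective tables (which I suppose explains why ordering by date alone is so fast).

The query plan for the slow execution is as follows (after vacuuming):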

                                                                     QUERY PLAN                                                                          
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=27944.57..27944.69 rows=50 width=104) (actual time=321.465..321.471 rows=50 loops=1)
   ->  Sort  (cost=27944.57..28218.93 rows=109747 width=104) (actual time=321.464..321.465 rows=50 loops=1)
         Sort Key: uas.read, a.date
         Sort Method: top-N heapsort  Memory: 34kB
         ->  Hash Left Join  (cost=3863.32..24298.85 rows=109747 width=104) (actual time=45.736..292.656 rows=92297 loops=1)
               Hash Cond: (a.id = at.article_id)
               ->  Hash Left Join  (cost=3668.47..23088.83 rows=109747 width=17) (actual time=44.205..235.573 rows=92297 loops=1)
                     Hash Cond: ((uf.user_login = uas.user_login) AND (a.id = uas.article_id))
                     ->  Hash Join  (cost=1.57..14331.50 rows=109747 width=24) (actual time=0.019..73.701 rows=92297 loops=1)
                           Hash Cond: (a.feed_id = uf.feed_id)
                           ->  Seq Scan on articles a  (cost=0.00..12757.64 rows=94964 width=20) (actual time=0.003..34.462 rows=93916 loops=1)
                           ->  Hash  (cost=1.31..1.31 rows=21 width=12) (actual time=0.011..0.011 rows=21 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                 ->  Seq Scan on users_feeds uf  (cost=0.00..1.31 rows=21 width=12) (actual time=0.003..0.009 rows=21 loops=1)
                                       Filter: (user_login = 'test1'::text)
                                       Rows Removed by Filter: 4
                     ->  Hash  (cost=1741.65..1741.65 rows=92283 width=17) (actual time=44.170..44.170 rows=92282 loops=1)
                           Buckets: 2048  Batches: 8  Memory Usage: 639kB
                           ->  Seq Scan on users_articles_states uas  (cost=0.00..1741.65 rows=92283 width=17) (actual time=0.005..24.293 rows=92282 loops=1)
                                 Filter: (user_login = 'test1'::text)
                                 Rows Removed by Filter: 10
               ->  Hash  (cost=135.49..135.49 rows=4749 width=95) (actual time=1.520..1.520 rows=4733 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 606kB
                     ->  Seq Scan on articles_thumbnails at  (cost=0.00..135.49 rows=4749 width=95) (actual time=0.004..0.765 rows=4733 loops=1)

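The index on 'read' is: "users_articles_states_read_idx" btree (read)

I'm guessing PostgreSQL cannot use this index. Is there another index I could create so that the results come back reasonably quickly, or is there some other way I could change the query itself to appease the database?

Edit 1: I mistakenly posted the original query with an error in it (an INNER JOIN on the 'uas' table).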
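
To make the index question concrete, here is an untested sketch of two candidate indexes. Both index names are hypothetical, and the partial index assumes that unread rows are a small minority of 'users_articles_states'. Whether the planner would actually pick either of them for the ORDER BY query is exactly what I am unsure about.

```sql
-- Candidate 1 (hypothetical name): composite index that leads with the
-- join/filter column and carries the sort column alongside the article id.
CREATE INDEX users_articles_states_login_read_article_idx
    ON users_articles_states (user_login, read, article_id);

-- Candidate 2 (hypothetical name): partial index covering only unread rows.
-- Assumption: unread rows are rare, so this index stays very small.
CREATE INDEX users_articles_states_unread_idx
    ON users_articles_states (user_login, article_id)
    WHERE NOT read;
```

Running ANALYZE on the table after creating either index and comparing the EXPLAIN ANALYZE output would show whether it is used at all.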

Table definitions:

readeef=> \d users_articles_states
 Table "public.users_articles_states"
   Column   |  Type   |   Modifiers   
------------+---------+---------------
 user_login | text    | not null
 article_id | bigint  | not null
 read       | boolean | default false
 favorite   | boolean | default false
Indexes:
    "users_articles_states_pkey" PRIMARY KEY, btree (user_login, article_id)
    "users_articles_states_read_idx" btree (read)
Foreign-key constraints:
    "users_articles_states_article_id_fkey" FOREIGN KEY (article_id) REFERENCES articles(id) ON DELETE CASCADE
    "users_articles_states_user_login_fkey" FOREIGN KEY (user_login) REFERENCES users(login) ON DELETE CASCADE


readeef=> \d articles             
                                    Table "public.articles"
   Column    |           Type           |                       Modifiers                       
-------------+--------------------------+-------------------------------------------------------
 id          | bigint                   | not null default nextval('articles_id_seq'::regclass)
 feed_id     | integer                  | 
 link        | text                     | 
 title       | text                     | 
 description | text                     | 
 date        | timestamp with time zone | 
 guid        | text                     | 
Indexes:
    "articles_pkey" PRIMARY KEY, btree (id)
    "articles_feed_id_guid_key" UNIQUE CONSTRAINT, btree (feed_id, guid)
    "articles_feed_id_link_key" UNIQUE CONSTRAINT, btree (feed_id, link)
    "articles_date_idx" btree (date)
Foreign-key constraints:
    "articles_feed_id_fkey" FOREIGN KEY (feed_id) REFERENCES feeds(id) ON DELETE CASCADE
Referenced by:
    TABLE "articles_extracts" CONSTRAINT "articles_extracts_article_id_fkey" FOREIGN KEY (article_id) REFERENCES articles(id) ON DELETE CASCADE
    TABLE "articles_scores" CONSTRAINT "articles_scores_article_id_fkey" FOREIGN KEY (article_id) REFERENCES articles(id) ON DELETE CASCADE
    TABLE "articles_thumbnails" CONSTRAINT "articles_thumbnails_article_id_fkey" FOREIGN KEY (article_id) REFERENCES articles(id) ON DELETE CASCADE
    TABLE "users_articles_states" CONSTRAINT "users_articles_states_article_id_fkey" FOREIGN KEY (article_id) REFERENCES articles(id) ON DELETE CASCADE

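Edit 2: Adding an index on 'user_login' does not remove the Seq Scan, probably because there are only a handful of users in the database.

Edit 3: I forgot to mention that the PostgreSQL version is 9.3.9.

Edit 4: I tried something different. I removed 'uas.read' from the ORDER BY clause and added "AND uas.read = 't'" to the WHERE clause. According to the planner, the execution time is 0.4 ms. Changing the latter to "AND uas.read = 'f'" makes the execution time jump to 622 ms. There is almost no difference between the two execution plans, apart from the costs, the filter (NOT read in one, read in the other), and the 'Rows Removed by Join Filter' count: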

                                                                                QUERY PLAN (slow query)                                                                                  
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.99..2374.08 rows=50 width=103) (actual time=0.064..671.332 rows=2 loops=1)
   ->  Nested Loop Left Join  (cost=0.99..50263.02 rows=1059 width=103) (actual time=0.063..671.330 rows=2 loops=1)
         ->  Nested Loop  (cost=0.71..49930.39 rows=1059 width=17) (actual time=0.057..671.319 rows=2 loops=1)
               ->  Nested Loop  (cost=0.29..47995.18 rows=3744 width=48) (actual time=0.039..363.354 rows=92052 loops=1)
                     Join Filter: (uf.feed_id = a.feed_id)
                     Rows Removed by Join Filter: 1873485 (1207 in fast one)
                     ->  Index Scan Backward using articles_date_idx on articles a  (cost=0.29..46589.91 rows=93597 width=20) (actual time=0.011..58.529 rows=93597 loops=1) (rows=60 in fast one)
                     ->  Materialize  (cost=0.00..1.32 rows=1 width=36) (actual time=0.000..0.001 rows=21 loops=93597) (loops=60 in fast one)
                           ->  Seq Scan on users_feeds uf  (cost=0.00..1.31 rows=1 width=36) (actual time=0.006..0.013 rows=21 loops=1)
                                 Filter: (user_login = 'test1'::text)
                                 Rows Removed by Filter: 4
               ->  Index Scan using users_articles_states_pkey on users_articles_states uas  (cost=0.42..0.51 rows=1 width=17) (actual time=0.003..0.003 rows=0 loops=92052)
                 Index Cond: ((user_login = 'test1'::text) AND (article_id = a.id))
                 Filter: (NOT read) (read in fast one)
                 Rows Removed by Filter: 1
     ->  Index Scan using articles_thumbnails_pkey on articles_thumbnails at  (cost=0.28..0.30 rows=1 width=94) (actual time=0.002..0.004 rows=1 loops=2)
           Index Cond: (a.id = article_id)
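
For reference, the Edit 4 variant of the query would look roughly like the sketch below. This is a reconstruction from the description in Edit 4, assuming nothing else in the original query changed. Swapping 'f' for 't' gives the fast variant.

```sql
-- Edit 4 variant (reconstructed): uas.read moved from ORDER BY into WHERE.
-- Filtering on uas.read in WHERE discards rows where the LEFT JOIN found
-- no match, which is consistent with the plan showing a plain Nested Loop
-- on users_articles_states instead of a left join.
SELECT a.id,
       COALESCE(uas.read, CAST(0 AS BOOLEAN)) AS read,
       COALESCE(at.link, '') AS thumbnail_link
FROM users_feeds uf
INNER JOIN articles a
    ON uf.feed_id = a.feed_id
LEFT OUTER JOIN users_articles_states uas
    ON a.id = uas.article_id AND uf.user_login = uas.user_login
LEFT OUTER JOIN articles_thumbnails at
    ON a.id = at.article_id
WHERE uf.user_login = 'test1'
  AND uas.read = 'f'          -- 'f' is the slow case, 't' the fast one
ORDER BY a.date DESC
LIMIT 50 OFFSET 0;
```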

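After testing the same data with a similar schema in sqlite3, ordering by 'uas.read' is slow there as well, but filtering on it in the WHERE clause is no problem. The execution time is the same ~0.5 ms regardless of whether the condition is 'AND NOT uas.read' or 'AND uas.read'.

Optimizing a query that sorts by a column on a different table

0 answers: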