索引未在加入时使用

时间:2016-06-19 17:48:28

标签: sql postgresql query-optimization

我有这个查询,但出于某种原因(据我所知),trackings.visitor_id上的索引没有被使用,导致缓慢(50ms)的嵌套循环。我知道COUNT表现不是最好的,但它仍然是一个相对较小的数据集,所以我认为50毫秒太多了。

我刚刚开始诊断查询性能,所以请原谅我的无知

SELECT v0."id"
    ,v0."app_id"
    ,v0."first_seen"
    ,v0."last_seen"
    ,v0."last_heard_from"
    ,v0."last_contacted"
    ,v0."last_page_viewed"
    ,v0."sessions"
    ,v0."seen_welcome_message"
    ,v0."email"
    ,v0."data"
    ,v0."name"
    ,v0."type"
    ,v0."avatar"
    ,c1."id"
    ,c1."last_message"
    ,c1."last_activity"
    ,c1."visitor_id"
    ,c1."app_id"
    ,c1."inserted_at"
    ,c1."updated_at"
    ,(
        SELECT array_to_json(array_agg(t))
        FROM trackings t
        WHERE t.visitor_id = v0."id"
        )
    ,count(v0."id") OVER ()
FROM "visitors" AS v0
INNER JOIN "conversations" AS c1 ON c1."visitor_id" = v0."id"
INNER JOIN "apps" AS a2 ON a2."id" = c1."app_id"
LEFT JOIN "trackings" AS t3 ON (t3."visitor_id" = v0."id")
    AND (t3."visitor_id" = v0."id")
INNER JOIN "accounts" AS a4 ON a4."id" = a2."account_id"
WHERE (a2."id" = $1)
    AND (
        (v0."name" ILIKE $2)
        OR (v0."email" ILIKE $3)
        )
    AND (first_seen <= NOW() - ($4 || ' days')::interval)
    AND (last_seen > NOW() - ($5 || ' days')::interval)
GROUP BY v0."id"
    ,c1."id"
ORDER BY v0."last_seen" DESC LIMIT $6

EXPLAIN的结果

Limit  (cost=10925.55..10925.62 rows=30 width=612) (actual time=94.938..94.946 rows=30 loops=1)
2     ->  Sort  (cost=10925.55..10926.85 rows=521 width=612) (actual time=94.936..94.939 rows=30 loops=1)
3           Sort Key: v0.last_seen
4           Sort Method: top-N heapsort  Memory: 50kB
5           ->  WindowAgg  (cost=4766.27..10910.16 rows=521 width=612) (actual time=72.447..94.439 rows=386 loops=1)
6                 ->  HashAggregate  (cost=4766.27..7837.56 rows=521 width=612) (actual time=71.757..71.986 rows=386 loops=1)
7                       Group Key: v0.id, c1.id
8                       ->  Hash Left Join  (cost=1759.72..4763.66 rows=521 width=612) (actual time=21.209..71.118 rows=488 loops=1)
9                             Hash Cond: ((v0.id)::text = (t3.visitor_id)::text)
10                            ->  Nested Loop  (cost=1751.56..4752.22 rows=521 width=612) (actual time=21.075..70.715 rows=386 loops=1)
11                                  ->  Nested Loop  (cost=0.00..13.26 rows=1 width=17) (actual time=0.025..0.108 rows=1 loops=1)
12                                        Join Filter: ((a2.account_id)::text = (a4.id)::text)
13                                        Rows Removed by Join Filter: 134
14                                        ->  Seq Scan on apps a2  (cost=0.00..8.22 rows=1 width=41) (actual time=0.010..0.057 rows=1 loops=1)
15                                              Filter: ((id)::text = 'ASnYW1-RgCl0I'::text)
16                                              Rows Removed by Filter: 257
17                                        ->  Seq Scan on accounts a4  (cost=0.00..3.35 rows=135 width=24) (actual time=0.005..0.028 rows=135 loops=1)
18                                  ->  Hash Join  (cost=1751.56..4733.75 rows=521 width=612) (actual time=21.041..70.491 rows=386 loops=1)
19                                        Hash Cond: ((v0.id)::text = (c1.visitor_id)::text)
20                                        ->  Bitmap Heap Scan on visitors v0  (cost=365.51..3275.51 rows=17860 width=456) (actual time=8.500..52.584 rows=18693 loops=1)
21                                              Recheck Cond: (last_seen > (now() - ('14 days'::cstring)::interval))
22                                              Filter: ((((name)::text ~~* '%%'::text) OR ((email)::text ~~* '%%'::text)) AND (first_seen <= (now() - ('1 days'::cstring)::interval)))
23                                              Rows Removed by Filter: 304
24                                              Heap Blocks: exact=1285
25                                              ->  Bitmap Index Scan on "visitors_last_seen_DESC_NULLS_LAST_index"  (cost=0.00..361.05 rows=19300 width=0) (actual time=8.257..8.257 rows=18997 loops=1)
26                                                    Index Cond: (last_seen > (now() - ('14 days'::cstring)::interval))
27                                        ->  Hash  (cost=1369.60..1369.60 rows=1316 width=156) (actual time=12.497..12.497 rows=1306 loops=1)
28                                              Buckets: 1024  Batches: 1  Memory Usage: 203kB
29                                              ->  Seq Scan on conversations c1  (cost=0.00..1369.60 rows=1316 width=156) (actual time=0.009..11.675 rows=1306 loops=1)
30                                                    Filter: ((app_id)::text = 'ASnYW1-RgCl0I'::text)
31                                                    Rows Removed by Filter: 42822
32                            ->  Hash  (cost=5.29..5.29 rows=229 width=24) (actual time=0.123..0.123 rows=229 loops=1)
33                                  Buckets: 1024  Batches: 1  Memory Usage: 13kB
34                                  ->  Seq Scan on trackings t3  (cost=0.00..5.29 rows=229 width=24) (actual time=0.004..0.037 rows=229 loops=1)
35                SubPlan 1
36                  ->  Aggregate  (cost=5.87..5.89 rows=1 width=76) (actual time=0.055..0.056 rows=1 loops=386)
37                        ->  Seq Scan on trackings t  (cost=0.00..5.86 rows=3 width=76) (actual time=0.048..0.052 rows=0 loops=386)
38                              Filter: ((visitor_id)::text = (v0.id)::text)
39                              Rows Removed by Filter: 229
40  Planning time: 2.629 ms
41  Execution time: 95.170 ms

有什么想法吗?

1 个答案:

答案 0 :(得分:0)

访问者和会话之间的联接在此查询中产生了大部分费用。

您加入了18693名访客,进行了1306次对话。 Postgres决定不再检查visitor.id索引1306次,而是从访问者那里获得所有18693个有趣的行并使用散列连接会更快。这可能是错的。

您可以调整random_page_cost closer to seq_page_cost,推送Postgres以获得更积极的索引使用。