How can I make a query that uses FILTER clauses run faster?

Asked: 2017-03-31 08:25:02

Tags: sql postgresql query-optimization

I have the following SQL, which takes 5.6 to 6 seconds to run:

SELECT COUNT(DISTINCT keyword_id) FILTER (WHERE rank>50 OR rank is null) AS "50+",
  COUNT(DISTINCT keyword_id) FILTER (WHERE rank BETWEEN 21 AND 50) AS "21-50",
  COUNT(DISTINCT keyword_id) FILTER (WHERE rank BETWEEN 11 AND 20) AS "11-20",
  COUNT(DISTINCT keyword_id) FILTER (WHERE rank BETWEEN 4 AND 10) AS "4-10",
  COUNT(DISTINCT keyword_id) FILTER (WHERE rank BETWEEN 1 AND 3) AS "1-3",
  date_trunc('month', rank_date) AS date
  FROM keyword_ranks, keywords
  WHERE keywords.deleted_at IS NULL
  AND keywords.id=keyword_ranks.keyword_id 
  AND keywords.business_id=27
  GROUP BY date_trunc('month', rank_date);
Can you help me optimize this query so that it runs faster? Thanks.

Edit

keyword_ranks schema

  create_table "keyword_ranks", force: :cascade do |t|
    t.integer  "rank"
    t.integer  "charge"
    t.text     "dom"
    t.integer  "keyword_id"
    t.datetime "created_at",                   null: false
    t.datetime "updated_at",                   null: false
    t.string   "volume"
    t.string   "cmc"
    t.integer  "all_time_service_change"
    t.integer  "all_time_cost_change"
    t.integer  "this_month_service_change"
    t.integer  "this_month_cost_change"
    t.integer  "last_30_days_service_change"
    t.integer  "last_30_days_cost_change"
    t.integer  "last_sevendays_service_change"
    t.integer  "last_sevendays_cost_change"
    t.datetime "service_callback_updated_at"
    t.datetime "cost_callback_updated_at"
    t.json     "service_raw_data"
    t.json     "cost_raw_data"
  end

  add_index "keyword_ranks", ["keyword_id"], name: "index_keyword_ranks_on_keyword_id", using: :btree
  add_index "keyword_ranks", ["rank_date"], name: "rank_date_index", using: :btree

Execution plan

"        Sort Method: external sort  Disk: 1808kB"
"        ->  Hash Join  (cost=93.73..158335.35 rows=110023 width=12) (actual time=2.546..5605.758 rows=99149 loops=1)"
"              Output: keyword_ranks.rank_date, keyword_ranks.keyword_id, keyword_ranks.rank"
"              Hash Cond: (keyword_ranks.keyword_id = keywords.id)"
"              ->  Seq Scan on public.keyword_ranks  (cost=0.00..151022.92 rows=1631592 width=12) (actual time=0.236..5177.327 rows=1631592 loops=1)"
"                    Output: keyword_ranks.id, keyword_ranks.rank, keyword_ranks.charge, keyword_ranks.dom, keyword_ranks.rank_date, key (...)"
"              ->  Hash  (cost=84.78..84.78 rows=716 width=4) (actual time=2.200..2.200 rows=714 loops=1)"
"                    Output: keywords.id"
"                    Buckets: 1024  Batches: 1  Memory Usage: 34kB"
"                    ->  Bitmap Heap Scan on public.keywords  (cost=17.83..84.78 rows=716 width=4) (actual time=1.218..2.080 rows=714 loops=1)"
"                          Output: keywords.id"
"                          Recheck Cond: ((keywords.business_id = 27) AND (keywords.deleted_at IS NULL))"
"                          Heap Blocks: exact=41"
"                          ->  Bitmap Index Scan on business_id_index  (cost=0.00..17.66 rows=716 width=0) (actual time=0.767..0.767 rows=714 loops=1)"
"                                Index Cond: (keywords.business_id = 27)"

1 answer:

Answer 0 (score: 1)

The query looks fine.

I suggest giving it a partial index:

CREATE INDEX idx ON keywords (business_id) WHERE deleted_at IS NULL;
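
One way to check whether the partial index is actually used is to re-run the keywords lookup under EXPLAIN, as in the sketch below. The last statement is not part of the answer above. It is an assumption based on the posted plan, where the sequential scan on keyword_ranks accounts for most of the runtime (about 5.2 of 5.6 seconds). An index covering the three columns the query reads from that table may let PostgreSQL avoid the full scan, though whether the planner chooses it depends on the data and the PostgreSQL version. The index name is hypothetical.

```sql
-- Refresh planner statistics after creating the partial index.
ANALYZE keywords;

-- Check whether the planner now uses the partial index (idx) for the
-- keywords side of the join.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM keywords
WHERE business_id = 27
  AND deleted_at IS NULL;

-- Assumption, not from the answer: a multicolumn index on the columns the
-- query reads from keyword_ranks (keyword_id, rank_date, rank), so the join
-- could be served from the index instead of a sequential scan.
-- The index name below is hypothetical.
CREATE INDEX keyword_ranks_keyword_id_rank_date_rank_idx
    ON keyword_ranks (keyword_id, rank_date, rank);
```

After creating either index, running the original query again under EXPLAIN (ANALYZE, BUFFERS) shows whether the Seq Scan on keyword_ranks is still present and how the timings changed.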