Question

I've noticed that COUNT(*) on a table is not an optimized query when the SQL gets complex.

Here is the SQL I'm using:

SELECT COUNT(*) FROM "items"
INNER JOIN (
  SELECT c.* FROM companies c LEFT OUTER JOIN company_groups ON c.id = company_groups.company_id
  WHERE company_groups.has_restriction IS NULL OR company_groups.has_restriction = 'f' OR company_groups.company_id = 1999 OR company_groups.group_id IN ('3','2')
  GROUP BY c.id
) AS companies ON companies.id = items.company_id
LEFT OUTER JOIN favs ON items.id = favs.item_id AND favs.user_id = 999 AND favs.is_visible = TRUE
WHERE "items"."type" IN ('Fashion') AND "items"."visibility" = 't' AND "items"."is_hidden" = 'f' AND (items.depth IS NULL OR (items.depth >= '0' AND items.depth <= '100')) AND (items.table IS NULL OR (items.table >= '0' AND items.table <= '100')) AND (items.company_id NOT IN (199,200,201))

This query takes 4084.8 ms to count about 0.35 million records in the database.

I'm using Rails as the framework, so the SQL I've composed fires a COUNT query based on the original query whenever I call results.count.

Since I'm using LIMIT and OFFSET, the basic results load in less than 32.0 ms (which is plenty fast).

Here is the output of EXPLAIN ANALYSE:

Merge Join  (cost=70743.22..184962.02 rows=7540499 width=4) (actual time=4018.351..4296.963 rows=360323 loops=1)
  Merge Cond: (c.id = items.company_id)
  ->  Group  (cost=0.56..216.21 rows=4515 width=4) (actual time=0.357..5.165 rows=4501 loops=1)
        Group Key: c.id
        ->  Merge Left Join  (cost=0.56..204.92 rows=4515 width=4) (actual time=0.303..2.590 rows=4504 loops=1)
              Merge Cond: (c.id = company_groups.company_id)
              Filter: ((company_groups.has_restriction IS NULL) OR (NOT company_groups.has_restriction) OR (company_groups.company_id = 1999) OR (company_groups.group_id = ANY ('{3,2}'::integer[])))
              Rows Removed by Filter: 10
              ->  Index Only Scan using companies_pkey on companies c  (cost=0.28..128.10 rows=4521 width=4) (actual time=0.155..0.941 rows=4508 loops=1)
                    Heap Fetches: 3
              ->  Index Scan using index_company_groups_on_company_id on company_groups  (cost=0.28..50.14 rows=879 width=9) (actual time=0.141..0.480 rows=878 loops=1)
  ->  Materialize  (cost=70742.66..72421.11 rows=335690 width=8) (actual time=4017.964..4216.381 rows=362180 loops=1)
        ->  Sort  (cost=70742.66..71581.89 rows=335690 width=8) (actual time=4017.955..4140.168 rows=362180 loops=1)
              Sort Key: items.company_id
              Sort Method: external merge  Disk: 6352kB
              ->  Hash Left Join  (cost=1.05..35339.74 rows=335690 width=8) (actual time=0.617..3588.634 rows=362180 loops=1)
                    Hash Cond: (items.id = favs.item_id)
                    ->  Seq Scan on items  (cost=0.00..34079.84 rows=335690 width=8) (actual time=0.504..3447.355 rows=362180 loops=1)
                          Filter: (visibility AND (NOT is_hidden) AND ((type)::text = 'Fashion'::text) AND (company_id <> ALL ('{199,200,201}'::integer[])) AND ((depth IS NULL) OR ((depth >= '0'::numeric) AND (depth <= '100'::nume (...)
                          Rows Removed by Filter: 5814
                    ->  Hash  (cost=1.04..1.04 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 8kB
                          ->  Seq Scan on favs  (cost=0.00..1.04 rows=1 width=4) (actual time=0.008..0.008 rows=0 loops=1)
                                Filter: (is_visible AND (user_id = 999))
                                Rows Removed by Filter: 3
Planning time: 3.526 ms
Execution time: 4397.849 ms

Please advise how I can make it run faster!

P.S.: All of the columns are indexed, e.g. type, visibility, is_hidden, table, depth, etc.

Thanks in advance!

Answer 1

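Well, there are two places in your query where you select everything (COUNT(*) and SELECT c.*). Maybe you can restrict those to specific columns and see whether it helps, for example: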

SELECT COUNT(OneSpecificColumn)
FROM "items"
INNER JOIN
  ( SELECT c.(AnotherSpecificColumn)
   FROM companies c
   LEFT OUTER JOIN company_groups ON c.id = company_groups.company_id
   WHERE company_groups.has_restriction IS NULL
     OR company_groups.has_restriction = 'f'
     OR company_groups.company_id = 1999
     OR company_groups.group_id IN ('3',
                                    '2')
   GROUP BY c.id) AS companies ON companies.id = items.company_id
LEFT OUTER JOIN favs ON items.id = favs.item_id
AND favs.user_id = 999
AND favs.is_visible = TRUE
WHERE "items"."type" IN ('Fashion')
  AND "items"."visibility" = 't'
  AND "items"."is_hidden" = 'f'
  AND (items.depth IS NULL
       OR (items.depth >= '0'
           AND items.depth <= '100'))
  AND (items.table IS NULL
       OR (items.table >= '0'
           AND items.table <= '100'))
  AND (items.company_id NOT IN (199,
                                200,
                                201))

You could also check whether both of those left joins are really necessary. Inner joins are cheaper and may speed up the search.
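One way to act on the join suggestion is sketched below. It is not the asker's query, only a possible rewrite under two assumptions that the thread does not confirm: that items.company_id is the real join column (as the Merge Cond in the EXPLAIN output suggests), and that favs holds at most one visible row per (item_id, user_id). Under those assumptions the LEFT JOIN on favs cannot change the count, and the grouped subquery can become an EXISTS test. The alias cg is introduced here for brevity.

```sql
-- Sketch only: counts the same rows without the favs join or the GROUP BY.
-- Column types (boolean flags, integer group_id, numeric depth) are taken
-- from the casts shown in the EXPLAIN output.
SELECT COUNT(*)
FROM items
WHERE items.type = 'Fashion'
  AND items.visibility
  AND NOT items.is_hidden
  AND (items.depth IS NULL OR items.depth BETWEEN 0 AND 100)
  AND (items."table" IS NULL OR items."table" BETWEEN 0 AND 100)
  AND items.company_id NOT IN (199, 200, 201)
  AND EXISTS (
        -- A company qualifies if at least one joined row passes the filter,
        -- which is what the original GROUP BY c.id subquery returned.
        SELECT 1
        FROM companies c
        LEFT JOIN company_groups cg ON cg.company_id = c.id
        WHERE c.id = items.company_id
          AND (cg.has_restriction IS NULL
               OR NOT cg.has_restriction
               OR cg.company_id = 1999
               OR cg.group_id IN (3, 2))
      );
```

Whether this is actually faster has to be measured with EXPLAIN ANALYZE on the real data. It removes the join to favs and gives the planner the option of a semi-join, but it still has to read the same rows from items.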

Answer 2

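Most of the time is spent in the sequential scan of items, and that cannot be improved, because you need almost all of the rows in the table.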
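The figures behind that statement can be read straight off the posted plan. The snippet below is only a back-of-the-envelope check using the "actual time" end values from the EXPLAIN ANALYSE output. Subtracting a child's end time from its parent's end time is an approximation of the time spent in the parent node, not an exact accounting.

```sql
-- Seq Scan on items finishes at 3447.355 ms; total execution time is 4397.849 ms.
SELECT round(100.0 * 3447.355 / 4397.849, 1) AS seq_scan_pct;   -- about 78.4

-- Sort ends at 4140.168 ms, its input (Hash Left Join) at 3588.634 ms,
-- so roughly 551.5 ms goes to the on-disk external merge sort.
SELECT round(100.0 * (4140.168 - 3588.634) / 4397.849, 1) AS sort_pct;   -- about 12.5
```

The scan also removes only 5814 rows while keeping 362180, which is why the planner prefers a sequential scan over any of the indexes.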

So the only ways to improve the query are:

to see that items is cached in memory
to get faster storage
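To check the first point, the following sketch may help. It assumes a PostgreSQL version that supports the BUFFERS option of EXPLAIN and has the pg_prewarm contrib extension available (9.4 or later). The thread does not state the server version, so treat both as assumptions. The simplified WHERE clause is only there to trigger the same scan of items.

```sql
-- BUFFERS adds "shared hit" (pages found in shared_buffers) and
-- "read" (pages fetched from the OS cache or disk) to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM items
WHERE type = 'Fashion' AND visibility AND NOT is_hidden;

-- Optionally load the table into shared_buffers ahead of time.
-- pg_prewarm returns the number of blocks it loaded.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('items');
```

If the scan node reports mostly "read" rather than "shared hit", the table is not staying in shared_buffers. Prewarming only helps when shared_buffers is large enough to hold the table.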

COUNT(*) is too slow using PostgreSQL
