Question

I have three models: Catalog, Product and Value. The values table has a characteristic_id column, and I want to get the list of distinct characteristic_id values across a set of values.

The relationships are:

A catalog has many products
A product has many values

Here is the query I came up with:

Value.joins(:product).select(:characteristic_id).distinct.where(products: {catalog_id: catalog.id}).pluck(:characteristic_id)
=> [441, 2582, 3133]

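It returns the correct result, but it is very slow (about 50 seconds) for a large catalog with a million products. I can't find a more efficient way to do this.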

Here is the EXPLAIN for the query:

=> EXPLAIN for: SELECT DISTINCT "values"."characteristic_id" FROM "values" INNER JOIN "products" ON "products"."id" = "values"."product_id" WHERE "products"."catalog_id" = $1 [["catalog_id", 1767]]
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1515106.82..1515109.15 rows=233 width=4)
   Group Key: "values".characteristic_id
   ->  Hash Join  (cost=124703.76..1492245.65 rows=9144469 width=4)
         Hash Cond: ("values".product_id = products.id)
         ->  Seq Scan on "values"  (cost=0.00..1002863.07 rows=34695107 width=8)
         ->  Hash  (cost=114002.20..114002.20 rows=652285 width=4)
               ->  Bitmap Heap Scan on products  (cost=12311.64..114002.20 rows=652285 width=4)
                     Recheck Cond: (catalog_id = 1767)
                     ->  Bitmap Index Scan on index_products_on_catalog_id  (cost=0.00..12148.57 rows=652285 width=0)
                           Index Cond: (catalog_id = 1767)
(10 rows)

Any ideas on how to make this query run faster?

Answer 1

Make sure there are indexes on both foreign keys:

index on "products"."catalog_id"
index on "values"."product_id"
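
As a rough sketch of this suggestion, the plain-Ruby snippet below prints PostgreSQL CREATE INDEX statements for the two foreign keys used by the query. The helper name create_index_sql is hypothetical (it is not part of Rails), the index names follow the index_<table>_on_<column> pattern visible in the EXPLAIN output, and the use of CONCURRENTLY is an assumption made to avoid blocking writes on large tables. The EXPLAIN output already shows index_products_on_catalog_id in use, so the index on values.product_id is probably the one to check first.

```ruby
# Hypothetical helper (not part of Rails): builds a PostgreSQL statement
# that creates a single-column index named in the Rails style.
def create_index_sql(table, column)
  index_name = "index_#{table}_on_#{column}"
  # CONCURRENTLY avoids blocking writes while the index is built;
  # it cannot run inside a transaction block.
  %(CREATE INDEX CONCURRENTLY IF NOT EXISTS "#{index_name}" ON "#{table}" ("#{column}");)
end

# The two foreign keys used by the query's filter and join.
FOREIGN_KEYS = [[:products, :catalog_id], [:values, :product_id]].freeze

INDEX_STATEMENTS = FOREIGN_KEYS.map { |table, column| create_index_sql(table, column) }
puts INDEX_STATEMENTS
```

The printed statements could be run in psql, or wrapped in a migration. Re-running EXPLAIN afterwards shows whether the planner actually picks the new index.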

Answer 2

Try adding an index on values.characteristic_id.
GROUP BY is often faster than DISTINCT:

Value.joins(:product).where(products: {catalog_id: catalog.id}).select(:characteristic_id).group(:characteristic_id).pluck(:characteristic_id)
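
The DISTINCT and GROUP BY forms return the same set of ids; only the way the database plans them may differ. The plain-Ruby sketch below illustrates that equivalence on a few made-up in-memory rows (the product ids and row layout are hypothetical; only the three characteristic ids are borrowed from the question's output). Whether GROUP BY is actually faster on the real tables is worth confirming with EXPLAIN rather than assuming.

```ruby
# Hypothetical sample rows standing in for the joined values/products result.
ROWS = [
  { product_id: 1, characteristic_id: 441 },
  { product_id: 1, characteristic_id: 2582 },
  { product_id: 2, characteristic_id: 441 },
  { product_id: 3, characteristic_id: 3133 },
  { product_id: 3, characteristic_id: 2582 }
].freeze

# DISTINCT-style: project the column, then drop duplicates.
DISTINCT_IDS = ROWS.map { |row| row[:characteristic_id] }.uniq

# GROUP BY-style: bucket rows by the column, then keep one key per bucket.
GROUPED_IDS = ROWS.group_by { |row| row[:characteristic_id] }.keys

# Both approaches yield the same set of characteristic ids.
puts DISTINCT_IDS.sort == GROUPED_IDS.sort
```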

Efficient select with distinct on a large association
