Rails - DISTINCT ON after a join

Asked: 2015-09-25 05:09:41

Tags: sql ruby-on-rails postgresql greatest-n-per-group ruby-on-rails-4.2

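I am using Rails 4.2 with PostgreSQL. I have a Product model and a Purchase model, where Product has many Purchases. I want to find the distinct, most recently purchased products. Initially I tried: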

Product.joins(:purchases)
.select("DISTINCT products.*, purchases.updated_at") #postgresql requires order column in select
.order("purchases.updated_at DESC")

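However, this produces duplicates, because it looks for all tuples with a unique value for the pair (products.id, purchases.updated_at). What I want is to select products with distinct ids after the join: if a product id appears more than once in the join, only the first row should be selected. So I also tried: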

Product.joins(:purchases)
.select("DISTINCT ON (products.id) purchases.updated_at, products.*")
.order("products.id, purchases.updated_at") #postgres requires that DISTINCT ON must match the leftmost order by clause

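This does not work either: because of this constraint I have to put the product id first in the ORDER BY clause, and that produces an unexpected order.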
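To make the two failure modes concrete, here is a minimal plain-Ruby sketch that models the joined rows in memory. The sample data and the names `joined_row` and `rows` are invented for illustration; the code only mimics what the two SQL statements do and is not ActiveRecord code.

```ruby
# Hypothetical result of products JOIN purchases; data invented for illustration.
joined_row = Struct.new(:product_id, :name, :purchased_at)

rows = [
  joined_row.new(1, "apple",  Time.utc(2015, 9, 1)),
  joined_row.new(1, "apple",  Time.utc(2015, 9, 20)),
  joined_row.new(2, "banana", Time.utc(2015, 9, 25)),
  joined_row.new(3, "cherry", Time.utc(2015, 9, 10)),
]

# Attempt 1: DISTINCT over (products.*, purchases.updated_at), newest first.
# Product 1 has two different timestamps, so it survives twice.
attempt1 = rows.uniq { |r| [r.product_id, r.purchased_at] }
               .sort_by { |r| r.purchased_at }
               .reverse
attempt1_ids = attempt1.map(&:product_id)

# Attempt 2: DISTINCT ON (products.id) with ORDER BY products.id, updated_at.
# One row per product, but sorted by id, and the oldest purchase is kept.
attempt2 = rows.sort_by { |r| [r.product_id, r.purchased_at] }
               .uniq { |r| r.product_id }
attempt2_ids = attempt2.map(&:product_id)
```

With this sample data the desired order would be products 2, 1, 3 (most recent purchase first), which neither attempt produces.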

What is the Rails way to achieve this?

5 answers:

Answer 0 (score: 3)

Use a subquery and add a different ORDER BY clause in the outer SELECT:

SELECT *
FROM  (
   SELECT DISTINCT ON (pr.id)
          pu.updated_at, pr.*
   FROM   Product pr
   JOIN   Purchases pu ON pu.product_id = pr.id  -- guessing
   ORDER  BY pr.id, pu.updated_at DESC NULLS LAST
   ) sub
ORDER  BY updated_at DESC NULLS LAST;
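The two-step shape of this query (keep one row per product, then re-sort the survivors) can be sketched in plain Ruby. This is only an in-memory illustration under assumptions: the rows are invented, `nil` stands in for SQL NULL, and the lambdas `desc_nulls_last` and `latest_per_product` are hypothetical helpers, not part of any library.

```ruby
# Hypothetical joined rows as [product_id, updated_at]; nil models a NULL timestamp.
rows = [
  [1, Time.utc(2015, 9, 1)],
  [1, Time.utc(2015, 9, 20)],
  [2, Time.utc(2015, 9, 25)],
  [3, nil],
]

# Sort key emulating "DESC NULLS LAST": non-nil values first, newest first.
desc_nulls_last = ->(time) { time.nil? ? [1, 0] : [0, -time.to_f] }

# Inner query: DISTINCT ON (pr.id) ... ORDER BY pr.id, pu.updated_at DESC NULLS LAST
latest_per_product = lambda do |joined|
  joined.sort_by { |id, time| [id, desc_nulls_last.call(time)] }
        .uniq { |id, _| id }
end

# Outer query: ORDER BY updated_at DESC NULLS LAST
result = latest_per_product.call(rows)
                           .sort_by { |_, time| desc_nulls_last.call(time) }
result_ids = result.map(&:first)
```

The inner sort decides which row each product keeps; the outer sort only decides the final presentation order, which is why the two ORDER BY clauses can differ.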

Details for DISTINCT ON:

Or some other query techniques:

But if all you need from Purchases is updated_at, you can get it more cheaply with a simple aggregate in a subquery before you join:

SELECT *
FROM   Product pr
JOIN  (
   SELECT product_id, max(updated_at) AS updated_at
   FROM   Purchases 
   GROUP  BY 1
   ) pu ON pu.product_id = pr.id  -- guessing
ORDER  BY pu.updated_at DESC NULLS LAST;
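As a rough illustration of "aggregate first, join second", the same steps can be mimicked on in-memory data in plain Ruby. The tables, names and values below are invented for the sketch; it does not touch a database.

```ruby
# Hypothetical tables; names and data are invented for illustration.
products  = { 1 => "apple", 2 => "banana", 3 => "cherry", 4 => "durian" }
purchases = [
  { product_id: 1, updated_at: Time.utc(2015, 9, 1) },
  { product_id: 1, updated_at: Time.utc(2015, 9, 20) },
  { product_id: 2, updated_at: Time.utc(2015, 9, 25) },
  { product_id: 3, updated_at: Time.utc(2015, 9, 10) },
]

# Subquery: SELECT product_id, max(updated_at) FROM purchases GROUP BY 1
latest = purchases
  .group_by { |p| p[:product_id] }
  .transform_values { |ps| ps.map { |p| p[:updated_at] }.max }

# Inner join on product_id, then ORDER BY updated_at DESC.
# Product 4 has no purchases, so the inner join drops it.
recent = latest
  .select { |product_id, _| products.key?(product_id) }
  .sort_by { |_, updated_at| updated_at }
  .reverse
  .map { |product_id, updated_at| [products[product_id], updated_at] }
recent_names = recent.map(&:first)
```

Because the aggregate collapses purchases to one row per product before the join, the join never produces duplicate products in the first place.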

About NULLS LAST:

Or even simpler, but not as fast when retrieving all rows:

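SELECT pr.*, max(pu.updated_at) AS updated_at
FROM   Product pr
JOIN   Purchases pu ON pu.product_id = pr.id
GROUP  BY pr.id  -- must be primary key
ORDER  BY max(pu.updated_at) DESC NULLS LAST;  -- an ordinal like 2 would point into the expanded pr.* columns

Product.id must be defined as the primary key for this to work: PostgreSQL then allows selecting every column of Product while grouping by pr.id alone, because the other columns are functionally dependent on the primary key.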

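If you only fetch a small selection (for example, with a WHERE clause restricting the query to one or a few pr.id values), this last query is going to be faster.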

Answer 1 (score: 2)

To build on erwin-brandstetter's answer, here is how you could do this with ActiveRecord (it should at least be close):

Product
  .select('*')
  .joins('INNER JOIN (SELECT product_id, max(updated_at) AS updated_at FROM purchases GROUP BY 1) pu ON pu.product_id = products.id') # the pr alias is not defined here, so reference the products table directly
  .order('pu.updated_at DESC NULLS LAST')

Answer 2 (score: 2)

So, building on @ErwinBrandstetter's answer, I finally found the right way to do this. The query to find the distinct most recent purchases is

SELECT *
FROM  (
   SELECT DISTINCT ON (pr.id)
          pu.updated_at, pr.*
   FROM   Product pr
   JOIN   Purchases pu ON pu.product_id = pr.id
   ) sub
ORDER  BY updated_at DESC NULLS LAST;

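The subquery has no ORDER BY here, on the reasoning that the outer query sorts the result anyway. Be aware, though, that without ORDER BY pr.id, pu.updated_at DESC inside the subquery, DISTINCT ON keeps an arbitrary purchase row per product, so the date it returns is not guaranteed to be the most recent one.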

The Rails way to do this is:

inner_query = Product.joins(:purchases)
  .select("DISTINCT ON (products.id) products.*, purchases.updated_at as date") #This selects all the unique purchased products.

result = Product.from("(#{inner_query.to_sql}) as unique_purchases")
  .select("unique_purchases.*").order("unique_purchases.date DESC")

The second (and better) way, as suggested by @ErwinBrandstetter, is

SELECT *
FROM   Product pr
JOIN  (
   SELECT product_id, max(updated_at) AS updated_at
   FROM   Purchases 
   GROUP  BY 1
   ) pu ON pu.product_id = pr.id
ORDER  BY pu.updated_at DESC NULLS LAST;

which can be written in Rails as

join_query = Purchase.select("product_id, max(updated_at) as date")
  .group(1) #This selects most recent date for all purchased products

result = Product.joins("INNER JOIN (#{join_query.to_sql}) as unique_purchases ON products.id = unique_purchases.product_id")
  .order("unique_purchases.date DESC NULLS LAST") #newest first, matching the SQL version

Answer 3 (score: 0)

I ended up with this:

Product.joins(:purchases)
.select("DISTINCT ON (products.id) products.*, purchases.updated_at as date")
.sort_by(&:date)
.reverse

Still looking for a better way.

Answer 4 (score: -1)

Try doing this:

Product.joins(:purchases)
.select("DISTINCT ON (purchases.product_id) purchases.product_id, purchases.updated_at, products.*")
.order("purchases.product_id, purchases.updated_at") #postgres requires that DISTINCT ON must match the leftmost order by clause