PostgreSQL DISTINCT ON with different ORDER BY

Asked: 2012-03-20 21:59:37

Tags: sql postgresql sql-order-by distinct-on

I want to run this query:

SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM purchases
WHERE purchases.product_id = 1
ORDER BY purchases.purchased_at DESC

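But I get this error:

PG::Error: ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions

Adding address_id as the first ORDER BY expression silences the error, but I really don't want to sort by address_id. Is it possible to do this without ordering by address_id?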

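8 answers:

Answer 0 (score: 158)

The documentation says:

DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. [...] Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. [...] The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).

Official documentation

So you will have to add address_id to the ORDER BY.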
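To see why the DISTINCT ON expression has to lead the ORDER BY, here is a minimal Python sketch of the semantics quoted above: sort the rows, then keep the first row of each run of equal keys. The sample rows and the helper name `distinct_on` are made up for illustration, and this is only a model of the behavior, not how PostgreSQL implements it.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data: (address_id, purchased_at) pairs.
rows = [
    (1, "2012-03-01"),
    (2, "2012-03-05"),
    (1, "2012-03-10"),
    (2, "2012-03-02"),
]


def distinct_on(sorted_rows, key):
    """Keep the first row of each run of consecutive rows with an equal key."""
    return [next(group) for _, group in groupby(sorted_rows, key=key)]


# Emulates ORDER BY address_id, purchased_at DESC:
# sort by the secondary key first, then stable-sort by the leading key.
by_date_desc = sorted(rows, key=itemgetter(1), reverse=True)
ordered = sorted(by_date_desc, key=itemgetter(0))

# One row per address_id: the most recent purchase.
latest_per_address = distinct_on(ordered, key=itemgetter(0))

# Emulates the outer ORDER BY purchased_at DESC.
result = sorted(latest_per_address, key=itemgetter(1), reverse=True)

# If the rows are sorted by date only, equal address_ids are no longer
# adjacent, so "first row of each run" stops being "one row per address".
not_grouped = distinct_on(by_date_desc, key=itemgetter(0))
```

In this sketch `not_grouped` contains address 1 twice, which is the situation the PostgreSQL error guards against; the two-step version mirrors the subquery-plus-outer-sort pattern.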

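Or, if you are looking for the full row containing the most recent purchase for each address_id, with the result sorted by purchased_at, then you are trying to solve a greatest-N-per-group problem, which can be solved by the following approaches.

A general solution that should work in most DBMSs: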

SELECT t1.* FROM purchases t1
JOIN (
    SELECT address_id, max(purchased_at) max_purchased_at
    FROM purchases
    WHERE product_id = 1
    GROUP BY address_id
) t2
ON t1.address_id = t2.address_id AND t1.purchased_at = t2.max_purchased_at
ORDER BY t1.purchased_at DESC
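One caveat with joining on max(purchased_at): if two purchases for the same address share the latest timestamp, both rows come back. The sketch below shows this using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table layout (including the id column), the sample data, and the extra `t1.id` sort key (added only to make the output order deterministic) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, address_id INTEGER, "
    "product_id INTEGER, purchased_at TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 10, 1, "2012-03-01"),
        (2, 10, 1, "2012-03-10"),
        (3, 20, 1, "2012-03-05"),
        (4, 20, 1, "2012-03-05"),  # same timestamp as id 3: a tie
        (5, 30, 2, "2012-03-20"),  # different product, filtered out
    ],
)

query = """
SELECT t1.id, t1.address_id, t1.purchased_at
FROM purchases t1
JOIN (
    SELECT address_id, max(purchased_at) AS max_purchased_at
    FROM purchases
    WHERE product_id = 1
    GROUP BY address_id
) t2
ON t1.address_id = t2.address_id AND t1.purchased_at = t2.max_purchased_at
ORDER BY t1.purchased_at DESC, t1.id
"""
join_rows = conn.execute(query).fetchall()
conn.close()
```

Here address 20 appears twice in `join_rows` because of the tie. If exactly one row per address is required, a tie-breaking column is needed, which DISTINCT ON or row_number() can express directly.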

A more PostgreSQL-oriented solution based on @hkf's answer:

SELECT * FROM (
  SELECT DISTINCT ON (address_id) *
  FROM purchases 
  WHERE product_id = 1
  ORDER BY address_id, purchased_at DESC
) t
ORDER BY purchased_at DESC

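The problem is clarified, extended, and solved here: Selecting rows ordered by some column and distinct on another

Answer 1 (score: 48)

You can order by address_id in a subquery, then order by whatever you want in an outer query.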

SELECT * FROM 
    (SELECT DISTINCT ON (address_id) purchases.address_id, purchases.* 
    FROM "purchases" 
    WHERE "purchases"."product_id" = 1 ORDER BY address_id DESC ) AS p  -- alias required before PostgreSQL 16
ORDER BY purchased_at DESC

Answer 2 (score: 36)

A subquery can solve it:

SELECT *
FROM  (
    SELECT DISTINCT ON (address_id) *
    FROM   purchases
    WHERE  product_id = 1
    ) p
ORDER  BY purchased_at DESC;

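The leading expressions in ORDER BY have to agree with the columns in DISTINCT ON, so you can't order by different columns in the same SELECT.

Only use an additional ORDER BY in the subquery if you want to pick a particular row from each set: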

SELECT *
FROM  (
    SELECT DISTINCT ON (address_id) *
    FROM   purchases
    WHERE  product_id = 1
    ORDER  BY address_id, purchased_at DESC  -- get "latest" row per address_id
    ) p
ORDER  BY purchased_at DESC;

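If purchased_at can be NULL, consider DESC NULLS LAST, because DESC sorts NULLs first by default in PostgreSQL.

Answer 3 (score: 10)

A window function can solve this in one pass: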

SELECT DISTINCT ON (address_id) 
   LAST_VALUE(purchases.address_id) OVER wnd AS address_id
FROM "purchases"
WHERE "purchases"."product_id" = 1
WINDOW wnd AS (
   PARTITION BY address_id ORDER BY purchases.purchased_at DESC
   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

Answer 4 (score: 4)

For anyone using Flask-SQLAlchemy, this worked for me:

from app import db
from app.models import Purchases
from sqlalchemy.orm import aliased
from sqlalchemy import desc

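# DISTINCT ON (address_id) in a subquery, aliased back to the Purchases model
stmt = Purchases.query.distinct(Purchases.address_id).subquery('purchases')
alias = aliased(Purchases, stmt)
distinct = db.session.query(alias)
# order_by() returns a new query, so the result must be reassigned
distinct = distinct.order_by(desc(alias.purchased_at))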

Answer 5 (score: 0)

SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM purchases
WHERE purchases.product_id = 1
ORDER BY address_id, purchases.purchased_at DESC

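Order by address_id first, then by purchased_at DESC.

address_id has to come first in the ORDER BY for the DISTINCT ON clause to be accepted.

Answer 6 (score: 0)

The following query can also be used to solve the problem, alongside the approaches in the other answers.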

WITH purchase_data AS (
        SELECT address_id, purchased_at, product_id,
                row_number() OVER (PARTITION BY address_id ORDER BY purchased_at DESC) AS row_number
        FROM purchases
        WHERE product_id = 1)
SELECT address_id, purchased_at, product_id
FROM purchase_data where row_number = 1
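The row_number() approach can be tried out without a PostgreSQL server. The sketch below runs a variant of the query through Python's built-in sqlite3 module, which is only a stand-in and needs SQLite 3.25 or newer for window functions. The table layout, the sample data, the `id DESC` tie-breaker, the `rn` alias, and the final ORDER BY purchased_at DESC (which the question asks for) are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, address_id INTEGER, "
    "product_id INTEGER, purchased_at TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 10, 1, "2012-03-01"),
        (2, 10, 1, "2012-03-10"),
        (3, 20, 1, "2012-03-05"),
        (4, 20, 1, "2012-03-05"),  # same timestamp as id 3: a tie
        (5, 30, 2, "2012-03-20"),  # different product, filtered out
    ],
)

query = """
WITH purchase_data AS (
    SELECT id, address_id, purchased_at,
           row_number() OVER (
               PARTITION BY address_id
               ORDER BY purchased_at DESC, id DESC  -- id breaks ties
           ) AS rn
    FROM purchases
    WHERE product_id = 1
)
SELECT id, address_id, purchased_at
FROM purchase_data
WHERE rn = 1
ORDER BY purchased_at DESC
"""
latest_rows = conn.execute(query).fetchall()
conn.close()
```

Because row_number() assigns exactly one row the number 1 in each partition, every address appears once even when timestamps tie; the tie-breaker only decides which of the tied rows is kept.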

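Answer 7 (score: -2)

You can also do this with a GROUP BY clause:

SELECT purchases.address_id, purchases.purchased_at FROM "purchases"
WHERE "purchases"."product_id" = 1
-- non-aggregated columns in the select list must appear in GROUP BY, so purchases.* is rejected here;
-- this returns one row per (address_id, purchased_at) pair, not one row per address_id
GROUP BY purchases.address_id, purchases.purchased_at
ORDER BY purchases.purchased_at DESC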