Related to: PostgreSQL DISTINCT ON with different ORDER BY
I have a table purchases (product_id, purchased_at, address_id).
Sample data:
| id | product_id | purchased_at | address_id |
| 1 | 2 | 20 Mar 2012 21:01 | 1 |
| 2 | 2 | 20 Mar 2012 21:33 | 1 |
| 3 | 2 | 20 Mar 2012 21:39 | 2 |
| 4 | 2 | 20 Mar 2012 21:48 | 2 |
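The result I expect is the most recently purchased product (full row) for each address_id, and that result must be sorted in descending order by the purchased_at field: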
| id | product_id | purchased_at | address_id |
| 4 | 2 | 20 Mar 2012 21:48 | 2 |
| 2 | 2 | 20 Mar 2012 21:33 | 1 |
Using the query:
SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM "purchases"
WHERE "purchases"."product_id" = 2
ORDER BY purchases.address_id ASC, purchases.purchased_at DESC
I get:
| id | product_id | purchased_at | address_id |
| 2 | 2 | 20 Mar 2012 21:33 | 1 |
| 4 | 2 | 20 Mar 2012 21:48 | 2 |
So the rows are the same, but the order is wrong. Is there any way to fix it?
Answer 0: (score: 16)
Quite a clear question :)
SELECT t1.* FROM purchases t1
LEFT JOIN purchases t2
ON t1.address_id = t2.address_id AND t1.purchased_at < t2.purchased_at
WHERE t2.purchased_at IS NULL
ORDER BY t1.purchased_at DESC
And most probably a faster approach:
SELECT t1.* FROM purchases t1
JOIN (
SELECT address_id, max(purchased_at) max_purchased_at
FROM purchases
GROUP BY address_id
) t2
ON t1.address_id = t2.address_id AND t1.purchased_at = t2.max_purchased_at
ORDER BY t1.purchased_at DESC
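Answer 1: (score: 8)
DISTINCT ON uses your ORDER BY to choose which row to produce for each distinct address_id. If you then want to order the resulting records, make the DISTINCT ON a subselect and order its results: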
SELECT * FROM
(
SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM "purchases"
WHERE "purchases"."product_id" = 2
ORDER BY purchases.address_id ASC, purchases.purchased_at DESC
) distinct_addrs
order by distinct_addrs.purchased_at DESC
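To see the "pick one row per group, then sort" shape in something you can run without a PostgreSQL server, here is a minimal sketch using Python's sqlite3 module. SQLite has no DISTINCT ON, so the inner query swaps in ROW_NUMBER() (window functions need SQLite 3.25 or later) to play the same role. The ISO-formatted timestamps, the `id DESC` tie-breaker and the names `ranked`, `rn` and `latest_per_address` are assumptions made for this illustration, not part of the original query.

```python
import sqlite3

# In-memory copy of the question's sample data. Timestamps are rewritten in
# ISO form so that plain string comparison orders them chronologically.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, product_id INTEGER, "
    "purchased_at TEXT, address_id INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 2, "2012-03-20 21:01", 1),
        (2, 2, "2012-03-20 21:33", 1),
        (3, 2, "2012-03-20 21:39", 2),
        (4, 2, "2012-03-20 21:48", 2),
    ],
)

# Inner query: rank rows within each address_id, newest first (the job the
# inner ORDER BY does for DISTINCT ON). Outer query: keep rank 1, then sort.
latest_per_address = conn.execute(
    """
    SELECT id, product_id, purchased_at, address_id
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (
                   PARTITION BY address_id
                   ORDER BY purchased_at DESC, id DESC
               ) AS rn
        FROM purchases AS p
        WHERE product_id = 2
    ) AS ranked
    WHERE rn = 1
    ORDER BY purchased_at DESC
    """
).fetchall()

for row in latest_per_address:
    print(row)
```

On the sample data this yields the row with id 4 first and the row with id 2 second, which is the order the question asks for.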
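Answer 2: (score: 0)
This query is trickier to rephrase correctly than it looks.
The currently accepted, join-based answer does not correctly handle the case where two candidate rows have the same purchased_at
value: it will return both rows.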
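The tie case is easy to reproduce. The sketch below runs the max()-join form against SQLite through Python's sqlite3 module; the three rows are hypothetical data invented for this illustration (rows 1 and 2 share both address_id and purchased_at), and `tied_rows` is just a local name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, product_id INTEGER, "
    "purchased_at TEXT, address_id INTEGER)"
)
# Hypothetical data: rows 1 and 2 have the same address and the same timestamp.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 2, "2012-03-20 21:33", 1),
        (2, 2, "2012-03-20 21:33", 1),
        (3, 2, "2012-03-20 21:48", 2),
    ],
)

# Join each row against the newest timestamp of its address_id.
tied_rows = conn.execute(
    """
    SELECT t1.id, t1.address_id, t1.purchased_at
    FROM purchases AS t1
    JOIN (
        SELECT address_id, max(purchased_at) AS max_purchased_at
        FROM purchases
        GROUP BY address_id
    ) AS t2
      ON t1.address_id = t2.address_id
     AND t1.purchased_at = t2.max_purchased_at
    ORDER BY t1.purchased_at DESC, t1.id
    """
).fetchall()

print(tied_rows)
```

Address 1 shows up twice in the output, once for each of the tied rows, because both match the maximum timestamp.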
You can get the correct behavior this way:
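SELECT * FROM purchases AS given
WHERE given.product_id = 2
AND NOT EXISTS (
    SELECT NULL FROM purchases AS other
    WHERE given.address_id = other.address_id
    AND other.product_id = given.product_id
    -- a later row wins; id breaks the tie only when the timestamps are equal
    AND (given.purchased_at < other.purchased_at
         OR (given.purchased_at = other.purchased_at AND given.id < other.id))
)
ORDER BY given.purchased_at DESC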
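Note how it falls back to comparing id values to disambiguate the case where the purchased_at values match. This ensures that the condition can hold for only a single row among those that share the same address_id value.
The original query using DISTINCT ON handles this case automatically!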
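Here is a runnable sketch of the NOT EXISTS approach, again on SQLite via Python's sqlite3 module. The data is hypothetical: rows 1 and 2 tie on purchased_at, and row 4 is older than row 3 despite having the larger id. In this version the id comparison is guarded by an equality test on purchased_at, so it acts purely as a tie-breaker, and the inner query is restricted to the same product; both details, and the name `one_per_address`, are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, product_id INTEGER, "
    "purchased_at TEXT, address_id INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 2, "2012-03-20 21:33", 1),
        (2, 2, "2012-03-20 21:33", 1),  # ties with row 1
        (3, 2, "2012-03-20 21:48", 2),
        (4, 2, "2012-03-20 21:10", 2),  # older than row 3, but larger id
    ],
)

# Keep a row only if no other row for the same address and product beats it:
# either strictly later, or equally late with a larger id.
one_per_address = conn.execute(
    """
    SELECT given.id, given.address_id, given.purchased_at
    FROM purchases AS given
    WHERE given.product_id = 2
      AND NOT EXISTS (
          SELECT NULL FROM purchases AS other
          WHERE other.address_id = given.address_id
            AND other.product_id = given.product_id
            AND (given.purchased_at < other.purchased_at
                 OR (given.purchased_at = other.purchased_at
                     AND given.id < other.id))
      )
    ORDER BY given.purchased_at DESC
    """
).fetchall()

print(one_per_address)
```

Each address appears exactly once: row 3 for address 2 and row 2 for address 1. Without the equality guard, row 3 would be knocked out by row 4's larger id and address 2 would vanish from the result.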
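Also note that you have to encode the fact that you want "the latest for each address_id" twice, in the given.purchased_at < other.purchased_at
condition and in the ORDER BY purchased_at DESC
clause, and you have to make sure the two match. I had to spend a few extra minutes convincing myself that this query really is correct.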
It is much easier to write this query correctly and understandably by using DISTINCT ON
with an outer subquery, as dbenhur suggests.