I have a table like this:
SELECT * FROM orders;
 client_id | order_id | salesman_id | price
-----------+----------+-------------+-------
         1 |      167 |           1 |    65
         1 |      367 |           1 |    27
         2 |      401 |           1 |    29
         2 |      490 |           2 |    48
         3 |      199 |           1 |    68
         3 |      336 |           2 |    22
         3 |      443 |           1 |    84
         3 |      460 |           2 |    92
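For each salesman, I want an array holding the order_id of the highest-priced sale for each unique (salesman, client) pair. For the data above, the result table should be: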
 salesman_id |    order_id
-------------+----------------
           1 | {167, 401, 443}
           2 | {490, 460}
So far I have this outline of a query:
SELECT salesman_id, max_client_salesman(order_id)
FROM orders
GROUP BY salesman_id;
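But I am having trouble writing the aggregate function max_client_salesman. The online documentation on aggregate functions and arrays in Postgres is very sparse. Any help is appreciated.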
Answer 0 (score: 2)
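I would combine the window function last_value() or first_value() with DISTINCT to efficiently get the order with the highest price per (salesman_id, client_id), and then aggregate the results into the array you are looking for with the simple aggregate function array_agg():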
SELECT salesman_id
      ,array_agg(max_order_id) AS most_expensive_orders_per_client
FROM  (
   SELECT DISTINCT
          salesman_id, client_id
         ,last_value(order_id) OVER (PARTITION BY salesman_id, client_id
                                     ORDER BY price
                                     ROWS BETWEEN UNBOUNDED PRECEDING
                                              AND UNBOUNDED FOLLOWING) AS max_order_id
   FROM   orders
   ) x
GROUP  BY salesman_id
ORDER  BY salesman_id;
Returns:
 salesman_id | most_expensive_orders_per_client
-------------+------------------------------------
           1 | {167, 401, 443}
           2 | {490, 460}
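If several orders share the highest price for a (salesman_id, client_id) pair, this query picks just one order_id arbitrarily, since the question does not define which one should win.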
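If a deterministic pick matters, one option is to add a tiebreaker to the window's ORDER BY. The following is a minimal sketch, assuming that among equally priced orders the one with the highest order_id should win; that rule is an assumption for illustration, not something the question specifies.

```sql
-- Assumption: among equal top prices, prefer the highest order_id.
SELECT salesman_id
      ,array_agg(max_order_id) AS most_expensive_orders_per_client
FROM  (
   SELECT DISTINCT
          salesman_id, client_id
         ,last_value(order_id) OVER (PARTITION BY salesman_id, client_id
                                     ORDER BY price, order_id  -- order_id breaks price ties
                                     ROWS BETWEEN UNBOUNDED PRECEDING
                                              AND UNBOUNDED FOLLOWING) AS max_order_id
   FROM   orders
   ) x
GROUP  BY salesman_id
ORDER  BY salesman_id;
```

Because both sort keys are ascending, the last row of each partition is the one with the highest price and, among ties, the highest order_id.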
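For this solution it is essential to understand that window functions are applied before DISTINCT. That order of evaluation is what allows DISTINCT to be combined with a window function here: the window function first writes the same max_order_id onto every row of a (salesman_id, client_id) pair, and DISTINCT then collapses those duplicates into one row.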
For an explanation of ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, see "PostgreSQL: running count of rows for a query 'by minute'".
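To illustrate why the frame clause matters: when a window has an ORDER BY but no explicit frame, the default frame ends at the current row and its peers, so last_value() returns a value from the current row's peer group rather than from the end of the partition. The sketch below compares both forms side by side and adds the first_value() variant mentioned above, which sorts descending and therefore works with the default frame. The output column names are made up for this illustration.

```sql
SELECT salesman_id, client_id, order_id, price
      -- default frame: ends at the current row's peers, so this is not the partition maximum
      ,last_value(order_id) OVER (PARTITION BY salesman_id, client_id
                                  ORDER BY price) AS last_default_frame
      -- explicit frame spanning the whole partition
      ,last_value(order_id) OVER (PARTITION BY salesman_id, client_id
                                  ORDER BY price
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING) AS last_full_frame
      -- descending sort: the default frame always contains the partition's first row
      ,first_value(order_id) OVER (PARTITION BY salesman_id, client_id
                                   ORDER BY price DESC) AS first_desc
FROM   orders
ORDER  BY salesman_id, client_id, price;
```

With the sample data, which has no price ties within a pair, last_default_frame simply repeats each row's own order_id, while last_full_frame and first_desc agree on the most expensive order of the pair.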
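PostgreSQL implements DISTINCT ON as an extension of the SQL standard. With it you can select unique rows based on a defined set of columns very efficiently. It does not get simpler or faster than this: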
SELECT salesman_id
      ,array_agg(order_id) AS most_expensive_orders_per_client
FROM  (
   SELECT DISTINCT ON (1, client_id)
          salesman_id, order_id
   FROM   orders
   ORDER  BY salesman_id, client_id, price DESC
   ) x
GROUP  BY 1
ORDER  BY 1;
I also use positional references (ordinal numbers that point at items of the SELECT list) to shorten the syntax.
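For comparison, here is a sketch of the same query with the ordinal numbers written out as column names. It should be equivalent, since 1 refers to the first item of the respective SELECT list, which is salesman_id in both the subquery and the outer query.

```sql
SELECT salesman_id
      ,array_agg(order_id) AS most_expensive_orders_per_client
FROM  (
   -- keep only the first row per (salesman_id, client_id) in sort order
   SELECT DISTINCT ON (salesman_id, client_id)
          salesman_id, order_id
   FROM   orders
   ORDER  BY salesman_id, client_id, price DESC  -- most expensive order sorts first
   ) x
GROUP  BY salesman_id
ORDER  BY salesman_id;
```

The leading ORDER BY expressions have to match the DISTINCT ON expressions; price DESC then decides which row of each pair is kept.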
Answer 1 (score: 0)
I think you want the Postgres function array_agg combined with row_number(). However, your description of the query does not make sense to me.
The following returns clients and salesmen, along with the list of orders, for each salesman's highest-priced order:
select client_id, salesman_id, array_agg(order_id)
from (select o.*,
             row_number() over (partition by salesman_id order by price desc) as sseqnum,
             row_number() over (partition by client_id order by price desc) as cseqnum
      from orders o
     ) o
where sseqnum = 1
group by salesman_id, client_id
I don't know what "the highest-priced sale for each salesman and client" means. Maybe you want:
where sseqnum = 1 or cseqnum = 1
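If the goal is instead the most expensive order per (salesman_id, client_id) pair, as the expected output in the question suggests, the same row_number() technique could be applied with a single partition over both columns. This is a sketch under that reading of the question; the names seqnum and order_ids are made up for the example.

```sql
select salesman_id, array_agg(order_id) as order_ids
from (select o.*,
             -- rank orders within each (salesman, client) pair, most expensive first
             row_number() over (partition by salesman_id, client_id
                                order by price desc) as seqnum
      from orders o
     ) o
where seqnum = 1
group by salesman_id
order by salesman_id;
```

For the sample data this collects orders 167, 401 and 443 for salesman 1 and orders 490 and 460 for salesman 2. The order of elements inside each array is not guaranteed unless an ORDER BY is added inside array_agg().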