I have a table named trades, as follows:
id trade_date trade_price trade_status seller_name
1 2015-01-02 150 open Alex
2 2015-03-04 500 close John
3 2015-04-02 850 close Otabek
4 2015-05-02 150 close Alex
5 2015-06-02 100 open Otabek
6 2015-07-02 200 open John
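For anyone who wants to reproduce this, the sample rows could be loaded roughly as follows. This is only a sketch: the column types are an assumption, since the actual table definition is not shown in the post.

```sql
-- Assumed schema: the post gives column names and sample values only,
-- so the types and constraints below are guesses.
CREATE TABLE trades (
    id           integer PRIMARY KEY,
    trade_date   date    NOT NULL,
    trade_price  numeric NOT NULL,
    trade_status text    NOT NULL,  -- 'open' or 'close' in the sample data
    seller_name  text    NOT NULL
);

-- The six sample rows from the table above.
INSERT INTO trades (id, trade_date, trade_price, trade_status, seller_name) VALUES
    (1, '2015-01-02', 150, 'open',  'Alex'),
    (2, '2015-03-04', 500, 'close', 'John'),
    (3, '2015-04-02', 850, 'close', 'Otabek'),
    (4, '2015-05-02', 150, 'close', 'Alex'),
    (5, '2015-06-02', 100, 'open',  'Otabek'),
    (6, '2015-07-02', 200, 'open',  'John');
```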
I want to sum trade_price grouped by seller_name, but only for sellers whose last trade_status (by trade_date) is 'open'. That is:
sum_trade_price seller_name
700 John
950 Otabek
The rows for seller_name Alex are skipped because his last trade_status is 'close'.
I can do this with a nested select:
SELECT SUM(t1.trade_price), t1.seller_name
FROM trades t1
-- keep a seller only if their most recent trade is still open
WHERE 'open' =
(SELECT t2.trade_status FROM trades t2
WHERE t2.seller_name = t1.seller_name
ORDER BY t2.trade_date DESC LIMIT 1)
GROUP BY t1.seller_name
But the query above takes more than a minute to run (I have about 100K rows). Is there another way to handle this? I am using PostgreSQL.
Answer 0: (score: 2)
I would approach this with window functions:
SELECT SUM(t.trade_price), t.seller_name
FROM (SELECT t.*,
FIRST_VALUE(trade_status) OVER (PARTITION BY seller_name ORDER BY trade_date desc) as last_trade_status
FROM trades t
) t
WHERE last_trade_status <> 'close'
GROUP BY t.seller_name;
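To see why this works, it may help to run the inner query on its own. The sketch below assumes the trades table and sample rows from the question. The final ORDER BY is only there to make the output easier to read.

```sql
-- For every row, look up the status of that seller's most recent trade.
SELECT seller_name,
       trade_date,
       trade_status,
       FIRST_VALUE(trade_status) OVER (
           PARTITION BY seller_name      -- one window per seller
           ORDER BY trade_date DESC      -- newest trade comes first
       ) AS last_trade_status
FROM trades
ORDER BY seller_name, trade_date;
```

With the six sample rows, both of Alex's rows carry last_trade_status = 'close', while every row for John and Otabek carries 'open'. The outer WHERE therefore removes Alex entirely before the SUM is taken. One caveat: if a seller has two trades on the same trade_date, which of them counts as "last" is not determined. Adding id to the ORDER BY as a tiebreaker is one possible fix.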
Answer 1: (score: 2)
This should work, joining back to trades on seller_name:
select
sum(trade_price) as sum_trade_price,
seller_name
from
trades
inner join
(
select distinct on (seller_name) seller_name, trade_status
from trades
order by seller_name, trade_date desc
) s using (seller_name)
where s.trade_status = 'open'
group by seller_name
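The subquery is the part doing the real work, so it can be worth running alone. A sketch, again assuming the sample rows from the question. The index at the end is a suggestion rather than something from the original answer, and its name is made up.

```sql
-- DISTINCT ON keeps the first row of each seller_name group in ORDER BY order,
-- i.e. the row with the latest trade_date for that seller.
SELECT DISTINCT ON (seller_name) seller_name, trade_date, trade_status
FROM trades
ORDER BY seller_name, trade_date DESC;

-- Optional and untested against the real data: an index matching the
-- ORDER BY above may help on larger tables. The index name is hypothetical.
CREATE INDEX trades_seller_date_idx
    ON trades (seller_name, trade_date DESC);
```

On the sample data the first query returns one row per seller: Alex with 'close' (2015-05-02), John with 'open' (2015-07-02) and Otabek with 'open' (2015-06-02). The join in the answer then keeps only the trades of the sellers whose row here is 'open'. Whether the planner actually uses the index depends on the data, so checking with EXPLAIN ANALYZE is advisable before relying on it.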