I have this table in a PostgreSQL database:
purchase
userid | date | price
---------------------------
1 | 2016-01-06 | 10
1 | 2016-01-05 | 5
2 | 2016-01-06 | 12
2 | 2016-01-05 | 15
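I want the sum of the last purchase price across all users. For user 1, the last purchase was on 2016-01-06 and the price is 10. For user 2, the last purchase was on 2016-01-06 and the price is 12. So the result of the SQL query should be 22.
How can I do this in SQL?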
Answer 0 (score: 4)
You can use a window function to compute a rank, then a normal aggregation with SUM:
WITH cte AS
(
SELECT *, RANK() OVER(PARTITION BY userid ORDER BY "date" DESC) AS r
FROM purchase
)
SELECT SUM(price) AS total
FROM cte
WHERE r = 1;
SqlFiddleDemo
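Keep in mind that this solution counts ties. To get exactly one purchase per user, you need a column that is unique within each group (for example a datetime). Even then, ties are still possible.
EDIT:
Handling ties: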
CREATE TABLE purchase(
userid INTEGER NOT NULL
,date timestamp NOT NULL
,price INTEGER NOT NULL
);
INSERT INTO purchase(userid,date,price) VALUES
(1, timestamp'2016-01-06 12:00:00',10),
(1,timestamp'2016-01-05',5),
(2,timestamp'2016-01-06 13:00:00',12),
(2,timestamp'2016-01-05',15),
(2,timestamp'2016-01-06 13:00:00',1000);
Note the difference between RANK() and ROW_NUMBER():
SqlFiddleDemo_RANK
SqlFiddleDemo_ROW_NUMBER
SqlFiddleDemo_ROW_NUMBER_2
Output:
╔════════╦══════════════╦══════════════╗
║ RANK() ║ ROW_NUMBER() ║ ROW_NUMBER() ║
╠════════╬══════════════╬══════════════╣
║ 1022 ║ 22 ║ 1010 ║
╚════════╩══════════════╩══════════════╝
Without a UNIQUE index on userid/date there is always a chance (perhaps a small one) of a tie. Any solution based on ORDER BY has to work in a stable manner.
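The three figures in the output table can be reproduced against the sample data with queries along the following lines. This is only a sketch: the original demos are not reproduced here, so the tie-breaking columns (price ASC and price DESC) are an assumption, chosen to make each result deterministic.

```sql
-- RANK(): tied rows share rank 1, so both of user 2's 13:00 purchases are summed.
WITH cte AS (
    SELECT price,
           RANK() OVER (PARTITION BY userid ORDER BY "date" DESC) AS r
    FROM purchase
)
SELECT SUM(price) AS total FROM cte WHERE r = 1;   -- 10 + 12 + 1000 = 1022

-- ROW_NUMBER() with an explicit tie-breaker (assumed rule: lowest price wins).
WITH cte AS (
    SELECT price,
           ROW_NUMBER() OVER (PARTITION BY userid
                              ORDER BY "date" DESC, price ASC) AS rn
    FROM purchase
)
SELECT SUM(price) AS total FROM cte WHERE rn = 1;  -- 10 + 12 = 22

-- ROW_NUMBER() with the opposite tie-breaker (assumed rule: highest price wins).
WITH cte AS (
    SELECT price,
           ROW_NUMBER() OVER (PARTITION BY userid
                              ORDER BY "date" DESC, price DESC) AS rn
    FROM purchase
)
SELECT SUM(price) AS total FROM cte WHERE rn = 1;  -- 10 + 1000 = 1010
```

With ROW_NUMBER() and only ORDER BY "date" DESC, which of the tied rows receives number 1 is not guaranteed, so the total could come out as either 22 or 1010.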
Answer 1 (score: 3)
To get the "latest" price per user you can use distinct on () in Postgres:
select distinct on (userid) userid, date, price
from the_table
order by userid, date desc
Now you just need to sum all the prices returned by the statement above:
select sum(price)
from (
select distinct on (userid) userid, price
from the_table
order by userid, date desc
) t;
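If two purchases by the same user can share the same date, which of them DISTINCT ON keeps is not specified unless the ORDER BY tells them apart. A minimal sketch, assuming the higher price should win a tie (that rule is an assumption for illustration, not part of this answer):

```sql
SELECT SUM(price) AS total
FROM (
    SELECT DISTINCT ON (userid) userid, price
    FROM the_table
    -- price DESC is an assumed tie-breaker for rows with the same userid and date
    ORDER BY userid, date DESC, price DESC
) t;
```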
Answer 2 (score: 1)
Answer 3 (score: 1)
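All the proposed solutions are fine and they work, but since my table contains millions of records I had to find a more efficient way to do what I want. A better approach seems to be to use the foreign key between the tables purchase and user (which I did not mention in my question, my apologies): purchase.userid -> user.id. Knowing this, I can run the following query: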
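-- "user" is a reserved word in PostgreSQL, so the table name is quoted here
-- (the query plan below shows the table was actually named user2).
select sum(t.price) from (
select (select price from purchase p where p.userid = u.id order by date desc limit 1) as price
from "user" u
) t;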
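The same per-user lookup can also be written with a LATERAL join (available since PostgreSQL 9.3), which may be easier to extend if more than one column of the latest purchase is needed. This is a sketch, not the author's query; it assumes the users table is named "user" (quoted, since user is a reserved word in PostgreSQL) and has an id column.

```sql
SELECT SUM(p.price) AS total
FROM "user" u
CROSS JOIN LATERAL (
    -- latest purchase of the current user; runs once per row of "user"
    SELECT price
    FROM purchase
    WHERE userid = u.id
    ORDER BY date DESC
    LIMIT 1
) p;
```

Users without any purchase produce no row here, whereas the scalar subquery yields NULL for them. SUM ignores NULLs, so both forms should return the same total.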
EDIT:
To answer @a_horse_with_no_name, here is the explain analyse verbose output.
His solution:
Aggregate (cost=64032401.30..64032401.31 rows=1 width=4) (actual time=566101.129..566101.129 rows=1 loops=1)
Output: sum(purchase.price)
-> Unique (cost=62532271.89..64032271.89 rows=10353 width=16) (actual time=453849.494..566087.948 rows=12000 loops=1)
Output: purchase.userid, purchase.price, purchase.date
-> Sort (cost=62532271.89..63282271.89 rows=300000000 width=16) (actual time=453849.492..553060.789 rows=300000000 loops=1)
Output: purchase.userid, purchase.price, purchase.date
Sort Key: purchase.userid, purchase.date
Sort Method: external merge Disk: 7620904kB
-> Seq Scan on public.purchase (cost=0.00..4910829.00 rows=300000000 width=16) (actual time=0.457..278058.430 rows=300000000 loops=1)
Output: purchase.userid, purchase.price, purchase.date
Planning time: 0.076 ms
Execution time: 566433.215 ms
My solution:
Aggregate (cost=28366.33..28366.34 rows=1 width=4) (actual time=53914.690..53914.690 rows=1 loops=1)
Output: sum((SubPlan 1))
-> Seq Scan on public.user2 u (cost=0.00..185.00 rows=12000 width=4) (actual time=0.021..3.816 rows=12000 loops=1)
Output: u.id, u.name
SubPlan 1
-> Limit (cost=0.57..2.35 rows=1 width=12) (actual time=4.491..4.491 rows=1 loops=12000)
Output: p.price, p.date
-> Index Scan Backward using purchase_user_date on public.purchase p (cost=0.57..51389.67 rows=28977 width=12) (actual time=4.490..4.490 rows=1 loops=12000)
Output: p.price, p.date
Index Cond: (p.userid = u.id)
Planning time: 0.115 ms
Execution time: 53914.730 ms
My table contains 300 million records.
I don't know whether it is relevant, but I also have an index on purchase (userid, date).
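That index does appear in the second plan, as purchase_user_date in the Index Scan Backward node. A sketch of how such an index could be created; the exact definition is not shown in the post, so this statement is an assumption based on the index name and the columns mentioned:

```sql
-- Assumed definition: index name taken from the query plan, columns from the post.
CREATE INDEX purchase_user_date ON purchase (userid, date);
```

Because the index is ordered by userid and then by date, the latest purchase of a given user sits at the end of that user's range. Scanning the index backward lets each subquery stop after a single row instead of sorting the whole table.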