Suppose I have the following tables.
product_prices:
product | price | date
--------+-------+-----------
apple   | 10    | 2014-03-01
apple   | 20    | 2014-05-02
egg     | 2     | 2014-03-03
egg     | 4     | 2015-10-12
purchases:
user | product | date
-----+---------+-----------
John | apple   | 2014-03-02
John | apple   | 2014-06-03
John | egg     | 2014-08-13
John | egg     | 2016-08-13
What I need is a table like this:
name | product | purchase date | price date | price
-----+---------+---------------+------------+------
John | apple   | 2014-03-02    | 2014-03-01 | 10
John | apple   | 2014-06-03    | 2014-05-02 | 20
John | egg     | 2014-08-13    | 2014-03-03 | 2
John | egg     | 2016-08-13    | 2015-10-12 | 4
In other words: "what was the price of the product on the day it was bought?" The price is determined by the dates in the product_prices table.
On the real database, I tried something like this:
SELECT name, product, pu.date, pp.date, pp.price
FROM purchases AS pu
LEFT JOIN product_prices AS pp
ON pu.date = (
SELECT date
FROM product_prices
ORDER BY date DESC LIMIT 1);
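But I either get only the left part of the table (with (null) in place of the price date and price), or a lot of rows with every combination of price and date.
Answer 0 (score: 1)
I suggest changing the product_prices table to use a daterange column (or at least a start_date and an end_date).
You can use an exclusion constraint to ensure that you never have overlapping ranges for one product, and an insert trigger to "close" the "current" price and create a new unbounded range for the newly inserted price.
A daterange can be indexed efficiently, and with it the query becomes quite easy: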
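SELECT name, pu.product, pu.date, pp.valid_during, pp.price
FROM purchases AS pu
-- match the product as well as the date, otherwise every product's price that
-- was valid on the purchase date is joined to the row
-- (with the revised table definition this would compare product_id instead)
LEFT JOIN product_prices AS pp ON pp.product = pu.product AND pu.date <@ pp.valid_during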
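(assuming the range column is named valid_during)
However, the exclusion constraint only works if the product is an integer (not a varchar), but I assume your real product_purchases table uses a foreign key to some product table anyway (which is an integer).
The new table definition might look like this: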
create table purchase_prices
(
product_id integer not null references products,
price numeric(16,4) not null,
valid_during daterange not null
);
And the constraint that prevents overlapping ranges:
alter table purchase_prices
add constraint check_price_range
exclude using gist (product_id with =, valid_during with &&);
The constraint requires the btree_gist extension.
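A minimal sketch of enabling the extension, assuming you have the privileges to create extensions in the target database. It has to run before the alter table statement above. The two inserts that follow are illustrative only and assume a row with id 1 already exists in products.

```sql
-- btree_gist supplies GiST operator classes for plain scalar types such as
-- integer, which the "product_id with =" part of the constraint relies on.
create extension if not exists btree_gist;

-- Illustration (assumes products contains id 1):
insert into purchase_prices (product_id, price, valid_during)
values (1, 10, daterange('2014-03-01', '2014-05-02', '[)'));

-- This range overlaps the one above for the same product, so the
-- exclusion constraint check_price_range rejects it.
insert into purchase_prices (product_id, price, valid_during)
values (1, 20, daterange('2014-04-01', null, '[)'));
```

A range for a different product_id, or one that starts on 2014-05-02 or later, would be accepted because the stored range excludes its upper bound.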
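As always, faster queries come at a price, in this case the higher maintenance cost of the GiST index. You will need to run some tests to see whether the simpler (and possibly faster) query outweighs the slower insert performance on purchase_prices.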
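The insert trigger mentioned earlier in this answer could be sketched roughly as follows. This is one possible shape, not a tested production implementation: the function name close_current_price and the trigger name purchase_prices_close_current are made up for illustration, and new rows are assumed to arrive with an open-ended range such as daterange('2014-05-02', null).

```sql
-- Hypothetical helper: before a new price is inserted, end the product's
-- currently open-ended range at the start date of the new range.
create or replace function close_current_price()
returns trigger
language plpgsql
as $$
begin
  update purchase_prices
     set valid_during = daterange(lower(valid_during), lower(new.valid_during), '[)')
   where product_id = new.product_id
     and upper_inf(valid_during)                        -- only the "current" price
     and lower(valid_during) < lower(new.valid_during); -- that started earlier
  return new;  -- let the insert of the new row proceed unchanged
end;
$$;

-- "execute function" needs PostgreSQL 11 or later;
-- older versions spell it "execute procedure".
create trigger purchase_prices_close_current
before insert on purchase_prices
for each row
execute function close_current_price();
```

With this in place, inserting a price valid from 2014-03-01 and then one valid from 2014-05-02 for the same product leaves the first row with the range [2014-03-01,2014-05-02), so the exclusion constraint is never violated by ordinary price changes.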
Answer 1 (score: 0)
You can try something like this, although I am sure there are better ways:
with diffs as (
select
a.*,
b."date" as bdate,
b.price,
b."date" - a."date" as diffdays,
row_number() over (
partition by "user", a."product", a."date"
order by "user", a."product", a."date", b."date" - a."date" desc
) as sr
from purchases a
inner join product_prices b on a.product = b.product
where b."date" - a."date" < 1
)
select
"user" as "name",
product,
"date" as "purchase date",
bdate as "price date",
price
from diffs
where sr = 1
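Example: https://www.db-fiddle.com/f/dwQ9EXmp1SdpNpxyV1wc6M/0
Explanation
I join the two tables to compute the difference between each purchase date and every price date for the same product, keep only the price dates on or before the purchase, and rank them by how close they are to the purchase date. Rank 1 is the closest date. Then I select the rows with rank 1.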
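The same "rank, then keep rank 1" idea can be written more compactly with DISTINCT ON, which is specific to PostgreSQL. This is a sketch of an alternative rather than the query from the fiddle, and it assumes the purchases and product_prices tables from the question.

```sql
-- For every purchase, keep only the first row in the ORDER BY, that is the
-- price row with the latest date that is not after the purchase date.
select distinct on (pu."user", pu.product, pu."date")
       pu."user" as name,
       pu.product,
       pu."date" as "purchase date",
       pp."date" as "price date",
       pp.price
from purchases pu
join product_prices pp
  on pp.product = pu.product
 and pp."date" <= pu."date"
order by pu."user", pu.product, pu."date", pp."date" desc;
```

The DISTINCT ON expressions have to match the leading ORDER BY expressions; the trailing pp."date" desc decides which row of each group survives.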
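Answer 2 (score: 0)
This is a great place to use date ranges! We know the start date of each price range, and we can use a window function to get the next date. At that point, it is easy to determine the price on any given day.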
with price_ranges as
(select product,
price,
date as price_date,
daterange(date, lead(date, 1)
OVER (partition by product order by date), '[)'
) as valid_price_range from product_prices
)
select "user" as name,
purchases.product,
purchases.date,
price_date,
price
from purchases
join price_ranges on purchases.product = price_ranges.product
and purchases.date <@ price_ranges.valid_price_range
order by purchases.date;
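To see what the price_ranges step produces, the following sketch runs the same daterange/lead expression over the sample prices from the question. The data is supplied inline through a VALUES list so that the statement runs on its own.

```sql
with product_prices(product, price, date) as (
  values ('apple', 10, date '2014-03-01'),
         ('apple', 20, date '2014-05-02'),
         ('egg',    2, date '2014-03-03'),
         ('egg',    4, date '2015-10-12')
)
select product,
       price,
       -- lead() is null for the last price of a product, which yields a
       -- range with no upper bound
       daterange(date, lead(date) over (partition by product order by date), '[)')
         as valid_price_range
from product_prices
order by product, date;
-- apple | 10 | [2014-03-01,2014-05-02)
-- apple | 20 | [2014-05-02,)
-- egg   |  2 | [2014-03-03,2015-10-12)
-- egg   |  4 | [2015-10-12,)
```

Because each range includes its lower bound and excludes its upper bound, a purchase made on the day a price changes picks up the new price.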
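Answer 3 (score: 0)
Look very carefully at the scalar subquery. It is not correlated back to the outer query. In other words, it returns the same result every time: the latest date in the product_prices table. Period. Consider the query out of context: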
SELECT date
FROM product_prices
ORDER BY date DESC LIMIT 1
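It has two problems:
1. It always returns 2015-10-12, and in the end nothing was purchased on that date, hence the nulls.
2. The join compares dates for equality, so unless a purchase falls exactly on the date of a product_prices row, you will always miss. "Closest" implies distance and ranking.
WITH close_prices_by_purchase AS (
  SELECT
    p."user",
    p.product,
    p.date AS purchase_date,
    pp.date AS price_date,
    pp.price,
    -- rank the candidate prices of each purchase, most recent price date first
    row_number() OVER (PARTITION BY p."user", p.product, p.date ORDER BY pp.date DESC) AS distance
  FROM purchases AS p
  INNER JOIN product_prices AS pp ON pp.product = p.product
  WHERE pp.date <= p.date -- only prices already in effect on the purchase date
)
SELECT "user" AS name, product, purchase_date, price_date, price
FROM close_prices_by_purchase AS cpbp
WHERE distance = 1; -- shortest distance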
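For comparison, the query from the question can also be repaired by correlating the scalar subquery with the outer row, which is the piece this answer points out as missing. This is a sketch against the question's sample tables, not necessarily the fastest formulation.

```sql
SELECT pu."user" AS name,
       pu.product,
       pu.date  AS purchase_date,
       pp.date  AS price_date,
       pp.price
FROM purchases AS pu
LEFT JOIN product_prices AS pp
  ON pp.product = pu.product
 AND pp.date = (
       SELECT max(p2.date)
       FROM product_prices AS p2
       WHERE p2.product = pu.product  -- correlate on the purchased product
         AND p2.date <= pu.date       -- and on the purchase date
     );
```

The LEFT JOIN keeps purchases made before the first known price of a product; those rows come back with null in price_date and price.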