I need to get the cost of an item as of a specific date and time. I have these two tables:
create table sales ( product_id int, items_sold int, date_loaded date );
create table product ( product_id int, description string, item_cost double, date_loaded date );
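The product table is a history of each item. If an item costs $1.00 today but cost $0.99 yesterday, I will have two records, one for each day. When I load the sales data, I need to reflect the item's cost as of yesterday, not today's cost.
Here is the query I would like to run: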
SELECT s.product_id, s.items_sold, p.description, s.items_sold * p.item_cost as total_cost FROM sales s, product p
WHERE
p.product_id = s.product_id and
p.date_loaded = (
SELECT MAX(pp.date_loaded)
FROM product pp
WHERE
pp.product_id = s.product_id and
pp.date_loaded <= s.date_loaded
)
SALES TABLE:
|PRODUCT_ID |ITEMS_SOLD |DATE_LOADED |
|1 |4 |2016-06-30 |
|1 |5 |2016-07-01 |
|1 |6 |2016-07-02 |
|1 |3 |2016-07-03 |
PRODUCT TABLE:
|PRODUCT_ID |DESCRIPTION |ITEM_COST |DATE_LOADED |
|1 |ITEM A |0.99 |2016-06-20 |
|1 |ITEM A |1.00 |2016-07-02 |
I would like to see this result:
|PRODUCT_ID |ITEMS_SOLD |DESCRIPTION |ITEM_COST |TOTAL_COST |
|1 |4 |ITEM A |0.99 |3.96 |
|1 |5 |ITEM A |0.99 |4.95 |
|1 |6 |ITEM A |1.00 |6.00 |
|1 |3 |ITEM A |1.00 |3.00 |
From everything I have read, Hive does not allow this form of subquery. So how can I achieve this in Hive?
Answer 0 (score: 0)
-- Each product row is valid from its own date_loaded until the next row's date_loaded
-- for the same product (fromdate); the latest row gets a far-future default.
WITH result AS (
  SELECT product_id, description, item_cost, date_loaded,
         LEAD(date_loaded, 1, '2999-01-01')
           OVER (PARTITION BY product_id ORDER BY date_loaded) AS fromdate
  FROM product
)
SELECT s.product_id, s.items_sold, p.description, s.items_sold * p.item_cost AS total_cost
FROM sales s JOIN result p ON s.product_id = p.product_id
WHERE s.date_loaded >= p.date_loaded AND s.date_loaded < p.fromdate;
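To sanity-check the interval logic without a Hive cluster, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same window-function query. Several things here are assumptions on my part rather than anything from the question: SQLite stands in for Hive (so dates are stored as ISO text, and window functions need SQLite 3.25 or newer), `PARTITION BY product_id` is used so each product gets its own cost history, and `p.item_cost`, `ROUND`, and the final `ORDER BY` are added only to make the output easy to compare with the expected table.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Hive (assumption: SQLite >= 3.25).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (product_id INT, items_sold INT, date_loaded TEXT);
CREATE TABLE product (product_id INT, description TEXT,
                      item_cost REAL, date_loaded TEXT);

-- Sample rows copied from the question.
INSERT INTO sales VALUES
  (1, 4, '2016-06-30'), (1, 5, '2016-07-01'),
  (1, 6, '2016-07-02'), (1, 3, '2016-07-03');
INSERT INTO product VALUES
  (1, 'ITEM A', 0.99, '2016-06-20'),
  (1, 'ITEM A', 1.00, '2016-07-02');
""")

QUERY = """
WITH result AS (
  SELECT product_id, description, item_cost, date_loaded,
         -- End of the validity window: the next load date for the same product.
         LEAD(date_loaded, 1, '2999-01-01')
           OVER (PARTITION BY product_id ORDER BY date_loaded) AS fromdate
  FROM product
)
SELECT s.product_id, s.items_sold, p.description, p.item_cost,
       ROUND(s.items_sold * p.item_cost, 2) AS total_cost
FROM sales s
JOIN result p ON s.product_id = p.product_id
WHERE s.date_loaded >= p.date_loaded
  AND s.date_loaded < p.fromdate
ORDER BY s.date_loaded
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each sale should match exactly one product row, because the windows `[date_loaded, fromdate)` for a given product do not overlap. Sales dated before 2016-07-02 pick up the 0.99 cost and later ones the 1.00 cost. Note that a sale dated before the first product record would match no window and drop out of the join.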