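I have a fairly complex problem, and I'm not even sure it can be done with Presto/SQL alone, so I would appreciate any help.
The setup. I have a table of orders (table1):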
order_id | customer_id | order_date | blahblahblah....
--------------------------------------------------
11111 | 5432567 | 2018-12-16 | ..........
10002 | 6543212 | 2019-01-21 | ..........
22222 | 3456788 | 2018-11-09 | ..........
I also have another table (table2):
customer_id | customer_rating | as_of_date |
--------------------------------------------
5432567 | A- | 2019-02-04 |
6543212 | B+ | 2019-02-04 |
It is updated once a day, and I have to query it by specifying as_of_date, something like this:
select * from table2
where customer_id = '6543212'
and as_of_date = '2019-02-04' -- or whatever date
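Now the problem. I want to create a table that, for each order_id in table1, picks the customer_rating of that order's customer_id before the order_date (for example, where as_of_date = order_date - 1) and after the order date (say, today's as_of_date).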
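To make it a bit clearer, my mock-up of the result has one row per order: the order columns from table1 plus the rating before and the rating after the order date.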
Any ideas?
Answer 0 (score: 0)
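You can use a trick. Put the two tables together with union all. Then use window functions to get the previous and next rating dates for every row, using a cumulative max() and min().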
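With this information, you can use another window function to get the ratings, and finally filter the rows to keep only the ones that originally came from orders: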
-- orders is table1 from the question; rating rows have a non-null rating
select ot.*
from (select ot.*,
             -- every row that shares a rating date gets that date's rating
             max(rating) over (partition by customer_id, prev_rating_date) as prev_rating,
             max(rating) over (partition by customer_id, next_rating_date) as next_rating
      from (select ot.*,
                   -- latest rating date at or before this row
                   max(case when rating is not null then order_date end) over (partition by customer_id order by order_date asc) as prev_rating_date,
                   -- earliest rating date at or after this row
                   min(case when rating is not null then order_date end) over (partition by customer_id order by order_date desc) as next_rating_date
            from ((select order_id, customer_id, order_date, NULL as rating
                   from orders
                  ) union all
                  (select NULL, customer_id, as_of_date, customer_rating
                   from table2
                  )
                 ) ot
           ) ot
     ) ot
where rating is null;
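To see what the union all plus window function trick produces, here is a small sketch that runs the same query shape on an in-memory SQLite database from Python. SQLite only stands in for Presto here (it needs SQLite 3.25 or newer for window functions). The two orders come from the question; the rating rows, in particular the 'B' rating on 2018-12-15, are made-up sample data. The names conn, QUERY and rows are just for this illustration.

```python
import sqlite3

# In-memory database standing in for Presto (assumption: SQLite >= 3.25).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table orders (order_id integer, customer_id text, order_date text);
    create table table2 (customer_id text, customer_rating text, as_of_date text);

    -- orders taken from the question
    insert into orders values
        (11111, '5432567', '2018-12-16'),
        (10002, '6543212', '2019-01-21');

    -- rating rows: invented sample data for this sketch
    insert into table2 values
        ('5432567', 'B',  '2018-12-15'),
        ('5432567', 'A-', '2019-02-04'),
        ('6543212', 'B+', '2019-02-04');
""")

QUERY = """
select order_id, customer_id, order_date, prev_rating, next_rating
from (select ot.*,
             max(rating) over (partition by customer_id, prev_rating_date) as prev_rating,
             max(rating) over (partition by customer_id, next_rating_date) as next_rating
      from (select ot.*,
                   -- latest rating date at or before this row
                   max(case when rating is not null then order_date end)
                       over (partition by customer_id order by order_date asc) as prev_rating_date,
                   -- earliest rating date at or after this row
                   min(case when rating is not null then order_date end)
                       over (partition by customer_id order by order_date desc) as next_rating_date
            from (select order_id, customer_id, order_date, null as rating from orders
                  union all
                  select null, customer_id, as_of_date, customer_rating from table2
                 ) ot
           ) ot
     ) ot
where rating is null      -- keep only the rows that came from orders
order by order_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

An order with no earlier rating row simply gets a NULL (None in Python) for prev_rating, because its prev_rating_date is NULL and no rating row lands in the same partition.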
Answer 1 (score: -2)
If you want to organize the combined table, drop the * and list the column names individually.
Try this:
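SELECT
    table1.customer_id,
    table1.order_id,
    table1.order_date,
    -- note: table1 as shown in the question has no customer_rating column,
    -- so this line only works if table1 actually carries one
    table1.customer_rating AS customer_rating_before,
    table2.customer_rating AS customer_rating_after
FROM
    table1, table2
WHERE
    table1.customer_id = table2.customer_id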
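This combines the two tables, the old one (table1) and the new one (table2), and produces two customer_rating columns: one from the old table (table1) and one from the new table (table2).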
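Since the question describes table2 as a daily snapshot keyed by as_of_date, one way to get separate "before" and "after" columns is to join table2 twice, once on the day before the order and once on a chosen "today" date. The sketch below is only an illustration under that assumption, run on SQLite from Python rather than Presto. The orders are from the question, the snapshot rows for 2018-12-15 and 2019-01-20 are invented, and the names conn, QUERY, today and rating_rows are hypothetical. SQLite's date(x, '-1 day') is used for the date arithmetic; in Presto something like date_add('day', -1, order_date) would play that role.

```python
import sqlite3

# In-memory SQLite database standing in for Presto (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (order_id integer, customer_id text, order_date text);
    create table table2 (customer_id text, customer_rating text, as_of_date text);

    -- orders taken from the question
    insert into table1 values
        (11111, '5432567', '2018-12-16'),
        (10002, '6543212', '2019-01-21'),
        (22222, '3456788', '2018-11-09');

    -- daily snapshots: the 2018-12-15 and 2019-01-20 rows are invented
    insert into table2 values
        ('5432567', 'B',  '2018-12-15'),
        ('5432567', 'A-', '2019-02-04'),
        ('6543212', 'B+', '2019-01-20'),
        ('6543212', 'B+', '2019-02-04');
""")

QUERY = """
select o.order_id, o.customer_id, o.order_date,
       b.customer_rating as customer_rating_before,
       a.customer_rating as customer_rating_after
from table1 o
-- snapshot from the day before the order
left join table2 b
       on b.customer_id = o.customer_id
      and b.as_of_date = date(o.order_date, '-1 day')
-- snapshot for the chosen "today" date
left join table2 a
       on a.customer_id = o.customer_id
      and a.as_of_date = ?
order by o.order_id
"""

today = "2019-02-04"  # the "or whatever date" from the question
rating_rows = conn.execute(QUERY, (today,)).fetchall()
for row in rating_rows:
    print(row)
```

The left joins keep orders that have no snapshot on either date; those orders get NULL in the corresponding rating column instead of disappearing from the result.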