How to get the maximum value from one table based on values in another table

Time: 2017-03-27 22:06:26

Tags: sql database hadoop hive

I have two data tables.

Table A has effect_date in yyyy-mm-dd format and region (only west and east are possible)

effect_date  region
2013-01-01   west
2013-01-02   west
2013-01-10   west
2015-01-01   west
2015-01-02   west
2015-01-10   west

Table B has run_date in yyyy-mm-dd format (from 2015-01-01 to the current date, with no date gaps)

run_date   region
01-01-2015 west
01-02-2015 west
.
.
.
03-27-2015 west

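Now, for each row of Table B, I want the maximum effect_date from Table A that is less than that row's run_date. The sample output should be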

2013-01-10   2015-01-01 west
2015-01-01   2015-01-02 west
.
.
2015-01-02   2015-01-10 west
2015-01-10   2015-01-11 west

I don't know how to iterate over each row of Table B. My query is

select max(effect_date), region from TableA b join TableA a on a.region = b.region where effect_date < run_date; --returning only 1 row

Any help would be greatly appreciated. Thanks.

1 answer:

Answer 0: (score: 0)

If the second table has no gaps, then you can use a join and a cumulative max:

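-- One output row per row of b: the left join keeps every run_date,
-- and the running max carries the latest matched effect_date forward.
-- Note: this is "on or before" run_date and only sees effect dates that also appear in b.
select b.*,
       max(a.effect_date) over (partition by b.region
                                order by b.run_date
                               ) as effect_date
from b left join
     a
     on b.run_date = a.effect_date and b.region = a.region;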
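
If the comparison has to be strictly "less than", and effect dates earlier than the first run_date should count too (as in the sample output), one alternative is to join on region only, filter in the WHERE clause, and group by the Table B row. The following is a minimal sketch, not a tested Hive script: it runs the SQL through Python's built-in sqlite3 module as a stand-in for Hive, and it assumes both date columns are stored as yyyy-mm-dd strings so that string comparison matches date order. The table names a and b follow the answer above; the helper name latest_effect_before is hypothetical.

```python
import sqlite3


def latest_effect_before(effect_rows, run_rows):
    """For each (run_date, region) in b, return the latest effect_date
    in a that is strictly earlier than run_date for the same region."""
    conn = sqlite3.connect(":memory:")
    # Assumption: dates are ISO yyyy-mm-dd text, so "<" and MAX order them correctly.
    conn.execute("CREATE TABLE a (effect_date TEXT, region TEXT)")
    conn.execute("CREATE TABLE b (run_date TEXT, region TEXT)")
    conn.executemany("INSERT INTO a VALUES (?, ?)", effect_rows)
    conn.executemany("INSERT INTO b VALUES (?, ?)", run_rows)
    query = """
        SELECT MAX(a.effect_date) AS effect_date, b.run_date, b.region
        FROM b
        JOIN a ON a.region = b.region      -- equality join only
        WHERE a.effect_date < b.run_date   -- strict inequality as a filter
        GROUP BY b.run_date, b.region      -- one output row per row of b
        ORDER BY b.region, b.run_date
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


# Sample data from the question (only a few run dates, for brevity).
effect_rows = [(d, "west") for d in (
    "2013-01-01", "2013-01-02", "2013-01-10",
    "2015-01-01", "2015-01-02", "2015-01-10",
)]
run_rows = [(d, "west") for d in (
    "2015-01-01", "2015-01-02", "2015-01-10", "2015-01-11",
)]

result = latest_effect_before(effect_rows, run_rows)
for row in result:
    print(row)
```

Because this is an inner join with a filter, a run_date that has no earlier effect_date in its region is dropped from the result. The join also pairs every run_date with every effect_date in the same region before aggregating, so it may be noticeably more expensive than the windowed query on large tables.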