SAS Hadoop Hive SQL left join with an inequality clause: a workaround that does not drop rows from Table A

Time: 2019-04-08 11:49:33

Tags: hadoop sas left-join hiveql inequality

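I have 2 tables in Hadoop and want to left join Table B to Table A based on a specific condition.

The join is on 'ID' (a.ID = b.ID), but I only want to bring in the 2 columns 'status_date' and 'flag_y' from Table B when b.status_date >= a.date.

Table A: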

+------------+-----+--------+
|    date    | ID  | Flag_x |
+------------+-----+--------+
| 01/03/2019 | 100 | x      |
| 01/03/2019 | 101 | x      |
| 02/03/2019 | 102 | x      |
| 02/03/2019 | 103 | x      |
+------------+-----+--------+

Table B:

+-------------+---------+--------+
| status_date | field_x | Flag_y |
+-------------+---------+--------+
| 15/03/2019  |     100 | y      |
| 10/01/2019  |     102 | y      |
+-------------+---------+--------+

Desired output:

+------------+-----+--------+-------------+--------+
|    date    | ID  | Flag_x | status_date | Flag_y |
+------------+-----+--------+-------------+--------+
| 01/03/2019 | 100 | x      | 15/03/2019  | y      |
| 01/03/2019 | 101 | x      |             |        |
| 02/03/2019 | 102 | x      |             |        |
| 02/03/2019 | 103 | x      |             |        |
+------------+-----+--------+-------------+--------+

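I tried the code below, but it drops the row for ID 102. In this case I want to keep that row without pulling in the information from Table B, because 'status_date' is earlier than 'date' in Table A. I assume something needs to be added to the WHERE clause?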

Create Table Output As 
Select
a.*
,b.status_date
,b.flag_y

From Table_A as a
Left join Table_B as b
On b.ID = a.ID

Where b.status_date is Null or b.status_date >= a.date

I hope this makes sense and that someone can help.
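
One possible direction, offered as a sketch rather than a confirmed answer: the WHERE clause is evaluated after the left join, so for ID 102 the joined row has a non-null b.status_date that is earlier than a.date, and the filter removes the whole row. Moving the inequality into CASE expressions in the SELECT list keeps every Table_A row and only blanks out the Table_B columns. The query assumes that Table_B exposes the join key as ID (as in the query above), that a.date and b.status_date are DATE-typed or otherwise directly comparable, and that Table_B has at most one row per ID.

```sql
-- Sketch only. Assumptions: Table_B has an ID column, both date columns
-- are comparable (e.g. DATE type), and Table_B has at most one row per ID.
CREATE TABLE Output AS
SELECT
    a.*,
    -- Keep Table_B values only when status_date is on or after a.date;
    -- otherwise the CASE yields NULL and the Table_A row is still returned.
    CASE WHEN b.status_date >= a.date THEN b.status_date END AS status_date,
    CASE WHEN b.status_date >= a.date THEN b.flag_y END AS flag_y
FROM Table_A a
LEFT JOIN Table_B b
    ON b.ID = a.ID;
```

Under those assumptions, ID 100 would keep its status_date and flag_y, while IDs 101, 102 and 103 would come back with NULLs in both columns, which matches the desired output. On Hive 2.2.0 or later, which accepts non-equality expressions in the ON clause, adding `AND b.status_date >= a.date` to the ON condition and dropping the WHERE clause may be a simpler alternative.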

0 answers:

No answers