Left join on identically structured partitioned tables

Time: 2018-04-03 12:51:33

Tags: sql hive left-join partitioning

I have two tables with the same schema:

Table A : vendorname,branch,amount,region (partitioned by year,month,day)
Table B : vendorname,branch,amount,region (partitioned by year,month,day)

Data in table A:

john,c1,112,us
john,c2,113,uk
john,c3,199,aus

Data in table B:

john,c1,112,us
john,c2,113,uk
john,c3,99,aus
john,c4,144,br
john,c5,50,cr

Expected output:

john,c3,199,99,aus ==> mismatch for 199 and 99

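I need to compare every record in table A against table B. Table B may contain additional records. I am trying a left join but cannot get it to work.

Query tried: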

select * from (
(select vendorname,type,amount,region from A 
where vendorname='john' and  year='2018' and month='01' and day='01' ) t1
left join
(select vendorname,type,amount,region
from B
where vendorname='john' and  year='2018' and month='01' and day='01')t2
on (a.name=b.name and a.type=b.type))

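But I get NULL values for the columns that should match.

I cannot query the whole table: we need to select data from specific partitions, otherwise performance suffers.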
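For reference, the query as posted is not consistent with the table definitions: it selects `type` although the tables list `branch`, and the ON clause refers to aliases `a` and `b` and a column `name`, while the subqueries are aliased `t1` and `t2` and the column is `vendorname`. Whether that is what produced the NULLs in the real environment is an assumption. A minimal sketch of a consistent version, assuming the join key is (vendorname, branch) and the partition columns are stored as strings:

```sql
-- Sketch only: join key (vendorname, branch) and string-typed partition
-- columns are assumptions based on the table definitions in the question.
SELECT t1.vendorname,
       t1.branch,
       t1.amount AS amount_a,
       t2.amount AS amount_b,
       t1.region
FROM (
    SELECT vendorname, branch, amount, region
    FROM A
    WHERE vendorname = 'john'
      AND year = '2018' AND month = '01' AND day = '01'  -- read one partition of A
) t1
LEFT JOIN (
    SELECT vendorname, branch, amount, region
    FROM B
    WHERE vendorname = 'john'
      AND year = '2018' AND month = '01' AND day = '01'  -- read one partition of B
) t2
  ON t1.vendorname = t2.vendorname
 AND t1.branch = t2.branch
WHERE t2.vendorname IS NULL      -- row of A with no counterpart in B
   OR t1.amount <> t2.amount;    -- counterpart exists but the amounts differ
```

On the sample data above this keeps only the c3 row (199 in A, 99 in B). The partition predicates sit inside the subqueries so that each table is filtered before the join.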

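4 Answers:

Answer 0: (score: 0)

Use a full join. It is equivalent to the union of a left join and a right join, and gives you all the data from both tables.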

select * from table1 a full join table2 b on a.vendorname=b.vendorname and a.branch=b.branch and a.amount=b.amount and a.region=b.region
where...
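Because this join condition compares every column, including amount, a pair such as (john, c3, 199) in A and (john, c3, 99) in B does not match; each comes back as its own row, padded with NULLs on the other side. A minimal sketch of keeping only such unmatched rows while still reading a single partition, assuming tables A and B from the question, string-typed partition columns, and a vendorname that is never NULL in the data:

```sql
-- Sketch only: table names A/B and the partition values come from the
-- question; the "vendorname is never NULL" assumption is mine.
SELECT a.*, b.*
FROM (
    SELECT vendorname, branch, amount, region
    FROM A
    WHERE year = '2018' AND month = '01' AND day = '01'  -- one partition of A
) a
FULL OUTER JOIN (
    SELECT vendorname, branch, amount, region
    FROM B
    WHERE year = '2018' AND month = '01' AND day = '01'  -- one partition of B
) b
  ON a.vendorname = b.vendorname
 AND a.branch = b.branch
 AND a.amount = b.amount
 AND a.region = b.region
WHERE a.vendorname IS NULL       -- row exists only in B
   OR b.vendorname IS NULL;      -- row exists only in A
```

On the sample data this returns four rows: the c3 row from A, the c3 row from B, and the c4 and c5 rows that exist only in B. The mismatched amounts therefore appear on two separate rows rather than side by side.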

Answer 1: (score: 0)

select b.* from A inner join B
on a.vendorname=b.vendorname
and a.branch=b.branch
where a.amount<>b.amount

See here: http://sqlfiddle.com/#!9/f8a352/5

Answer 2: (score: 0)

You can try this code:

SELECT *
FROM tableA A full OUTER JOIN TABLEB B ON a.vendorname = B.vendorname AND  a.branch =b.branch 
WHERE EXISTS(
SELECT 1
FROM tableA A RIGHT JOIN tableB B ON a.vendorname = B.vendorname 
AND a.branch =b.branch 
AND a.amount<>b.amount 
AND a.region=b.region)

Answer 3: (score: 0)

It depends on the desired output:

  1. All cases where the vendor name and branch match but the amount does not

    select t2.* from
    (select vendorname,branch,amount,region from A
    where vendorname='john' and year='2018' and month='01' and day='01') t1
    inner join
    (select vendorname,branch,amount,region from B
    where vendorname='john' and year='2018' and month='01' and day='01') t2
    on (t1.vendorname=t2.vendorname and t1.branch=t2.branch)
    where t1.amount<>t2.amount

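  2. All cases where the vendor name and branch match but the amount does not, plus the additional records in table A, plus the additional records in table B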

    select t1.*,t2.* from
    (select vendorname,branch,amount,region from A
    where vendorname='john' and year='2018' and month='01' and day='01') t1
    full outer join
    (select vendorname,branch,amount,region from B
    where vendorname='john' and year='2018' and month='01' and day='01') t2
    on (t1.vendorname=t2.vendorname and t1.branch=t2.branch)
    where t1.amount<>t2.amount or t1.amount is null or t2.amount is null