OUTER JOIN of multiple tables in Pig

Date: 2012-10-25 20:31:37

Tags: join outer-join apache-pig

I need to join multiple tables. The command I am using is:

G = JOIN aa BY f, bb by f, cc by f, dd by f;

To turn it into a full outer join, I added the FULL keyword:

G = JOIN aa BY f FULL, bb by f, cc by f, dd by f;

But it gives me a "mismatched input" error message. How can I get a full outer join across all four tables?

Thanks!

2 answers:

Answer 0: (score: 6)

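According to the Pig documentation:

"Outer joins will only work for two-way joins; to perform a multi-way outer join, you will need to perform multiple two-way outer join statements."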
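The advice above can be sketched as a chain of two-way joins. This is only a minimal sketch, not the asker's actual script: the file names and the value fields a_val, b_val, c_val and d_val are assumptions for illustration; only aa, bb, cc, dd, f and G come from the question. After a full outer join the key can be null on either side, so the sketch keeps whichever key is non-null before feeding the result into the next join.

```pig
-- Hypothetical inputs: each file holds a key f and one value column.
aa = LOAD 'aa.tsv' AS (f:chararray, a_val:chararray);
bb = LOAD 'bb.tsv' AS (f:chararray, b_val:chararray);
cc = LOAD 'cc.tsv' AS (f:chararray, c_val:chararray);
dd = LOAD 'dd.tsv' AS (f:chararray, d_val:chararray);

-- Step 1: two-way full outer join of aa and bb.
ab = JOIN aa BY f FULL OUTER, bb BY f;
-- Keep whichever key is non-null so the next join can still match on it.
ab_k = FOREACH ab GENERATE
    (aa::f IS NOT NULL ? aa::f : bb::f) AS f,
    aa::a_val AS a_val,
    bb::b_val AS b_val;

-- Step 2: bring in cc the same way.
abc = JOIN ab_k BY f FULL OUTER, cc BY f;
abc_k = FOREACH abc GENERATE
    (ab_k::f IS NOT NULL ? ab_k::f : cc::f) AS f,
    ab_k::a_val AS a_val,
    ab_k::b_val AS b_val,
    cc::c_val AS c_val;

-- Step 3: bring in dd and project the final shape.
abcd = JOIN abc_k BY f FULL OUTER, dd BY f;
G = FOREACH abcd GENERATE
    (abc_k::f IS NOT NULL ? abc_k::f : dd::f) AS f,
    abc_k::a_val AS a_val,
    abc_k::b_val AS b_val,
    abc_k::c_val AS c_val,
    dd::d_val AS d_val;

DUMP G;
```

Without the intermediate FOREACH steps, a row that exists only in bb would carry a null aa::f, and a later join on aa::f would fail to match it against cc or dd.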

Answer 1: (score: 1)

You can emulate a full outer join with the COGROUP statement. For example, cogroup the following two files:

Decimal.csv

first|1
second|2
fourth|4

Roman.csv

first|I
second|II
third|III

Pig commands:

english = LOAD 'Decimal.csv' using PigStorage('|') as (name:chararray,value:chararray);
roman = LOAD 'Roman.csv' using PigStorage('|') as (name:chararray, value:chararray);
multi = cogroup english by name, roman by name;
dump multi;

Output:

(first,{(first,1)},{(first,I)})
(third,{},{(third,III)})
(fourth,{(fourth,4)},{})
(second,{(second,2)},{(second,II)})
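COGROUP accepts more than two relations, so the same idea might be applied to the four relations from the question in a single statement. The following is a sketch under assumptions: the file names and the value fields a_val through d_val are hypothetical, and only aa, bb, cc, dd and f come from the question.

```pig
-- Hypothetical inputs: each file holds a key f and one value column.
aa = LOAD 'aa.tsv' AS (f:chararray, a_val:chararray);
bb = LOAD 'bb.tsv' AS (f:chararray, b_val:chararray);
cc = LOAD 'cc.tsv' AS (f:chararray, c_val:chararray);
dd = LOAD 'dd.tsv' AS (f:chararray, d_val:chararray);

-- One record per distinct key, with one bag per input relation.
-- A relation that has no rows for a key contributes an empty bag.
grouped = COGROUP aa BY f, bb BY f, cc BY f, dd BY f;

DUMP grouped;
```

Each output record has the shape (group, {aa rows}, {bb rows}, {cc rows}, {dd rows}), as in the two-file output above. Note that applying FLATTEN directly to these bags would drop every key that is missing from any one input, because flattening an empty bag produces no rows; empty bags have to be handled first (for example by testing them with IsEmpty) if flat, join-style rows are needed.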