Overwrite existing data with newer data using Pig

Time: 2017-02-03 05:08:25

Tags: apache-pig

I have two tables:

1,'hello'
2,'world'
4,'this'

1,'john'
3,'king'

I want to produce a single table:

1,'john'
2,'world'
3,'king'
4,'this'

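I need to merge the two tables on the code column (the first field), with rows from the second table replacing rows from the first that have the same code. How can I do this? Thanks.

1 answer:

Answer 0: (score: 2)

Get the records that exist only in A, then UNION them with B.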

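-- Load both files; every row of B survives, so B's values take precedence.
A = LOAD 'test1.txt' USING PigStorage(',') AS (aid:int,aname:chararray);
B = LOAD 'test2.txt' USING PigStorage(',') AS (bid:int,bname:chararray);
-- Left outer join keeps every row of A; bid is null where B has no match.
C = JOIN A BY aid LEFT OUTER,B BY bid;
D = FILTER C BY bid is null;
-- Keep only A's columns for the unmatched rows, then append all of B.
E = FOREACH D GENERATE A::aid,A::aname;
F = UNION E,B;
DUMP F;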

Note: if you want the output sorted, order the final relation F by its first field.

G = ORDER F BY $0;
DUMP G;
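
As an alternative to the anti-join plus UNION above, the same result can probably be reached with a single FULL OUTER join and a conditional pick per field. This is only a sketch: it reuses the file names and schemas from the answer, the relation names J, M and S and the output field names id and name are made up for illustration, and it assumes each id appears at most once in each file (duplicates would multiply rows in the join).

```pig
-- Sketch only: assumes each id appears at most once per file.
A = LOAD 'test1.txt' USING PigStorage(',') AS (aid:int, aname:chararray);
B = LOAD 'test2.txt' USING PigStorage(',') AS (bid:int, bname:chararray);

-- Full outer join keeps ids found in either file; the missing side is null.
J = JOIN A BY aid FULL OUTER, B BY bid;

-- Prefer B's values when B has the id, otherwise fall back to A's.
M = FOREACH J GENERATE
        (B::bid IS NOT NULL ? B::bid : A::aid) AS id,
        (B::bid IS NOT NULL ? B::bname : A::aname) AS name;

-- Sort by id so the rows come out in a stable order.
S = ORDER M BY id;
DUMP S;
```

Compared with the UNION approach, this names the output fields explicitly, which can make later steps that refer to id or name easier to read.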
