Data 1
1,a
2,b
3,c
4,d
5,e
Data 2
1,a
2,g
3,j
4,b
5,c
6,d
7,e
Script
a = load '/tmp/data/data1' using PigStorage(',') as (timestamp:chararray,constant:chararray);
b = load '/tmp/data/data2' using PigStorage(',') as (timestamp:chararray,constant:chararray);
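I only need to output the constants that exist in data2 but not in data1 (the ones the two files do not have in common), as shown below: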
2,g
3,j
Thanks for your help.
Answer 0: (score: 0)
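Use a RIGHT OUTER JOIN followed by a FILTER that keeps only the rows where the field from a is null. This gives you all records in b that have no match in a. Note that the expected output (2,g and 3,j) is determined by the constant, not the timestamp, so the join key has to be constant; joining on timestamp would instead return 6,d and 7,e.
c = JOIN a BY constant RIGHT OUTER, b BY constant;
d = FILTER c BY a::constant IS NULL;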
DUMP d;
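Because the expected output in the question is keyed on the constant and contains only two fields, it may help to see the whole pipeline with a final projection. The following is a minimal sketch, assuming the same paths and schema as the question's script; the alias e and the FOREACH projection step are additions for illustration and not part of the original answer.

```pig
-- Load both files with the schema from the question.
a = LOAD '/tmp/data/data1' USING PigStorage(',') AS (timestamp:chararray, constant:chararray);
b = LOAD '/tmp/data/data2' USING PigStorage(',') AS (timestamp:chararray, constant:chararray);

-- Keep every row of b; the fields from a are null where no constant matches.
c = JOIN a BY constant RIGHT OUTER, b BY constant;

-- Rows of b whose constant never appears in a.
d = FILTER c BY a::constant IS NULL;

-- Assumption: only b's two fields are wanted, so drop the empty columns from a.
e = FOREACH d GENERATE b::timestamp AS timestamp, b::constant AS constant;

DUMP e;
```

With the sample data above, this should leave only the rows for g and j. Without the FOREACH step, each dumped tuple would also carry the two null fields coming from a. The order of the dumped tuples is not guaranteed.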