I have two relations in Pig:
DUMP A;
Sandeep Rohan Mohan
DUMP B;
MOHAN
I need the output of A - B; relation C should give me
Sandeep, Rohan
because they do not appear in B.
Answer 0: (score: 0)
Try this:
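-- LOAD takes a path; replace the quoted names with the files that hold A and B.
A1 = LOAD 'Sandeep Rohan Mohan' USING PigStorage() AS (line:chararray);
B1 = LOAD 'MOHAN' USING PigStorage() AS (line:chararray);
-- Upper-case both sides so that 'Mohan' matches 'MOHAN'.
A = FOREACH A1 GENERATE UPPER(line) AS line;
B = FOREACH B1 GENERATE UPPER(line) AS line;
-- COGROUP yields one tuple per key: (group, {bag of A tuples}, {bag of B tuples}).
C = COGROUP A BY line, B BY line;
-- Keep only the keys whose B bag is empty, i.e. names that never occur in B.
D = FILTER C BY IsEmpty(B);
E = FOREACH D GENERATE group AS name;
DUMP E;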
(ROHAN)
(SANDEEP)
Answer 1: (score: 0)
Implement it with a left outer join, and keep only the tuples that have a null value in $1.
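A minimal sketch of the left outer join approach, assuming each relation has a single chararray column holding one name per line. The file names `a.txt` and `b.txt` and the aliases `J`, `K` and `name` are hypothetical, chosen only for illustration. The upper-casing step is borrowed from the first answer so that `Mohan` and `MOHAN` compare equal.

```pig
-- Hypothetical input files, one name per line.
A1 = LOAD 'a.txt' USING PigStorage() AS (line:chararray);
B1 = LOAD 'b.txt' USING PigStorage() AS (line:chararray);

-- Normalize case so that 'Mohan' matches 'MOHAN'.
A = FOREACH A1 GENERATE UPPER(line) AS line;
B = FOREACH B1 GENERATE UPPER(line) AS line;

-- Left outer join: every tuple of A is kept.
-- $0 is A's column, $1 is B's column; $1 is null when B has no match.
J = JOIN A BY line LEFT OUTER, B BY line;
K = FILTER J BY $1 IS NULL;

-- Keep only the column that came from A.
C = FOREACH K GENERATE $0 AS name;
DUMP C;
```

With the sample data from the question, C should contain SANDEEP and ROHAN, though the order of the tuples is not guaranteed. Unlike the COGROUP version, a name that occurs several times in A appears the same number of times in C; add a DISTINCT step if set semantics are needed.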