I have a Pig relation defined as:
A = LOAD 'record.txt' AS (name:chararray, ID:int, subject:chararray, flag:boolean);
DUMP A;
( RAM,222,JAVA,true)
( RAM,111,DotNet,false)
( RAM,444,HTML,false)
( SAM,777,DotNet,true)
( SAM,333,JAVA,false)
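How can I generate an additional field, reference, that holds the concatenation of name and ID when flag is true, and otherwise repeats the previous value until the next true row appears, as shown below: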
( RAM,222,JAVA,true,RAM-222)
( RAM,111,DotNet,false,RAM-222)
( RAM,444,HTML,false,RAM-222)
( SAM,777,DotNet,true,SAM-777)
( SAM,333,JAVA,false,SAM-777)
I used the script below, but it does not give the desired result.
A = LOAD 'demo.txt' AS (name:chararray, ID:int, subject:chararray, flag:boolean);
B = FOREACH A GENERATE name, ID, subject, flag, CONCAT(CONCAT(name, '-'), (chararray)ID) AS reference;
DUMP B;
( RAM,222,JAVA,true,RAM-222)
( RAM,111,DotNet,false,RAM-111)
( RAM,444,HTML,false,RAM-444)
( SAM,777,DotNet,true,SAM-777)
( SAM,333,JAVA,false,SAM-333)
How should the CONCAT call be written, or is there any other way to get the desired result?
Answer 0 (score: 1)
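-- Load the data; the field names here (id, sub, flg, rf) are this answer's own.
A = LOAD 'demo.txt' AS (name:chararray, id:int, sub:chararray, flg:boolean);
-- Build name-id for every row; only the values from flagged rows are kept below.
B = FOREACH A GENERATE name, id, sub, flg, CONCAT(CONCAT(name, '-'), (chararray)id) AS rf;
-- Separate the flagged rows from the rest (flg is a boolean, not a string).
SPLIT B INTO b1 IF flg == true, b2 IF flg == false;
-- Give each unflagged row the reference of the flagged row with the same name.
-- This assumes exactly one flagged row per name, as in the sample data.
C = JOIN b2 BY name LEFT OUTER, b1 BY name;
C1 = FOREACH C GENERATE b2::name AS name, b2::id AS id, b2::sub AS sub, b2::flg AS flg, b1::rf AS rf;
Result = UNION b1, C1;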
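To check the output of the script, it may help to have the rule from the question written out in a few lines. The sketch below is plain Python, not Pig, and only illustrates the intended "carry the last true row forward" logic on the sample rows. The function name `add_reference` and the in-memory `rows` list are made up for this illustration.

```python
def add_reference(rows):
    """Append the name-ID of the most recent flag == True row to every row."""
    out = []
    current = None  # reference carried forward from the last flagged row
    for name, id_, subject, flag in rows:
        if flag:
            # A flagged row starts a new reference value.
            current = f"{name}-{id_}"
        out.append((name, id_, subject, flag, current))
    return out


# Sample rows from the question, in file order.
rows = [
    ("RAM", 222, "JAVA", True),
    ("RAM", 111, "DotNet", False),
    ("RAM", 444, "HTML", False),
    ("SAM", 777, "DotNet", True),
    ("SAM", 333, "JAVA", False),
]

result = add_reference(rows)
for row in result:
    print(row)
```

Unlike the join by name, this version depends on row order, so it also covers a name that has more than one true row. Rows that come before the first true row get no reference (None).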
Hope this helps!!