child = load 'file_name' using PigStorage('\t') as (child_code : chararray, child_id : int, child_precode_id : int);
parents = load 'file_name' using PigStorage('\t') as (child_id : int, child_internal_id : chararray, mother_id : int, father_id : int);
joined = JOIN child by child_id, parents by child_id;
mainparent = FOREACH joined GENERATE child::child_id as child_id_source, child_precode_id, child_code;
store mainparent into '(location of file)' using PigStorage('\t');
childfirst = JOIN mainparent by (child_id_source), parents by (mother_id OR father_id);
firstgen = FOREACH childfirst GENERATE child_id, child_precode_id, child_code;
store firstgen into 'file_location' using PigStorage('\t');
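Using the OR condition produces the following error:
ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Pig script failed to parse: NoViableAltException(91@[]) Failed to parse: Pig script failed to parse: NoViableAltException(91@[])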
Answer 0 (score: 1)
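The following syntax is incorrect. Pig has no conditional join: JOIN is an equi-join, so each BY clause takes key expressions, not a boolean condition.
childfirst = JOIN mainparent by (child_id_source), parents by (mother_id OR father_id);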
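If you want to join one key against either of two keys, create two joins and union the resulting datasets. Note that you may have to apply DISTINCT to the resulting relation, since a row that matches on both keys (for example, when mother_id and father_id hold the same value) appears in both joins.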
childfirst = JOIN mainparent by (child_id_source), parents by (mother_id);
childfirst1 = JOIN mainparent by (child_id_source), parents by (father_id);
childfirst2 = UNION childfirst,childfirst1;
childfirst3 = DISTINCT childfirst2;
firstgen = FOREACH childfirst3 GENERATE child_id, child_precode_id, child_code;
store firstgen into 'file_location' using PigStorage('\t');
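As a possible alternative to the two joins above, the two parent columns can be folded into a single key column first, so that one ordinary equi-join covers both cases. This is only a minimal sketch: the aliases parent_keys_raw and parent_keys and the field name parent_id are illustrative names, not from the original post, and it assumes the built-in TOBAG function and FLATTEN operator are available in your Pig version.

```pig
-- Assumption: parent_keys_raw, parent_keys and parent_id are hypothetical names.
-- Emit one row per (child_id, parent) pair: TOBAG packs both parent ids into a bag,
-- and FLATTEN turns that bag into two rows.
parent_keys_raw = FOREACH parents GENERATE child_id, FLATTEN(TOBAG(mother_id, father_id)) AS parent_id;

-- Drop duplicate pairs, e.g. when mother_id and father_id hold the same value.
parent_keys = DISTINCT parent_keys_raw;

-- A single equi-join now matches child_id_source against either parent column.
childfirst = JOIN mainparent BY child_id_source, parent_keys BY parent_id;
firstgen = FOREACH childfirst GENERATE child_id, child_precode_id, child_code;
store firstgen into 'file_location' using PigStorage('\t');
```

This should give the same set of first-generation rows as the UNION approach, though the number of duplicate rows may differ. Rows whose mother_id or father_id is null simply drop out of the inner join.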