Apache Pig: LIMIT inside FOREACH referencing a top-level field fails with "Scalar has more than one row in the output"

Asked: 2015-09-23 23:11:15

Tags: foreach apache-pig limit

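This question is similar to one asked two years ago, but for some reason that solution does not work for me. It is really a combination of the two ideas in the title, each of which already has an answered question. The example below replicates the accepted solution, yet it fails for me: what is my mistake? Here is a complete, self-contained example: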

Here is the data:

cat in_detail.csv

grp,val
1,2.1,
1,4.2,
1,6.3
2,6.5
2,1.2
2,4.3
2,3.2

cat in_cnt.csv
grp,cnt
1,2
2,3

Expected output (sort order does not matter):

grp,val
1,2.1,
1,4.2,
2,6.5
2,1.2
2,4.3
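
To pin down the intended result independently of Pig, here is a minimal Python sketch of the semantics I am after: keep at most cnt rows of each group. The function name limit_per_group and the in-memory lists are illustrative assumptions, not part of the Pig script; the trailing commas in the data above are ignored here, and groups without a cnt entry are dropped, as an inner JOIN would do.

```python
from collections import defaultdict


def limit_per_group(detail_rows, cnt_rows):
    """Keep at most `cnt` rows per group, preserving input order.

    detail_rows: iterable of (grp, val) pairs (contents of in_detail.csv)
    cnt_rows:    iterable of (grp, cnt) pairs (contents of in_cnt.csv)
    Groups with no entry in cnt_rows are dropped (inner-join behaviour).
    """
    limits = dict(cnt_rows)      # grp -> maximum number of rows to keep
    kept = defaultdict(int)      # grp -> number of rows kept so far
    out = []
    for grp, val in detail_rows:
        if kept[grp] < limits.get(grp, 0):
            out.append((grp, val))
            kept[grp] += 1
    return out


# Sample data from the question, without the header rows.
detail = [("1", 2.1), ("1", 4.2), ("1", 6.3),
          ("2", 6.5), ("2", 1.2), ("2", 4.3), ("2", 3.2)]
cnt = [("1", 2), ("2", 3)]

result = limit_per_group(detail, cnt)
for grp, val in result:
    print(f"{grp},{val}")
```

The key point is that the limit is looked up per group, rather than being a single value shared by the whole relation.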

Here is the code, with the error message included as comments:

detail1    = LOAD '/tmp/sD_mvmd/c0nelha/data/in_detail.csv'   using PigStorage(',') as (grp:chararray,num:double);

cnt1       = LOAD '/tmp/sD_mvmd/c0nelha/data/in_cnt.csv'   using PigStorage(',') as (grp:chararray,cnt:int);

d_group = GROUP detail1 by (grp);

describe d_group;

--d_group: {group: chararray,detail1: {(grp: chararray,num: double)}}

describe cnt1;

--cnt1: {grp: chararray,cnt: int}

detail2 = JOIN d_group by (group), cnt1 by (grp);

describe detail2;

--detail2: {d_group::group: chararray,d_group::detail1: {(grp: chararray,num: double)},cnt1::grp: chararray,cnt1::cnt: int}

detail3 = FOREACH detail2 {

    mySelection   = LIMIT d_group::detail1 detail2.cnt1::cnt;

    GENERATE mySelection;
}

-- Apache Pig version 0.12.1.2.1.5.0-695 (rexported) compiled Aug 27 2014, 23:56:19

-- Backend error : Scalar has more than one row in the output.

-- 1st : (1,{(1,6.3),(1,4.2),(1,2.1)},1,2), 2nd :(2,{(2,3.2),(2,4.3),(2,1.2),(2,6.5)},2,3)

dump detail3;

0 answers:

No answers yet