group_concat equivalent in Pig?

Time: 2013-09-13 07:02:18

Tags: mysql hadoop apache-pig

I'm trying to get this done in Pig. (I'm looking for the equivalent of MySQL's group_concat().)

In my table, for example, I have three fields (userid, clickcount, pagenumber):

155 | 2 | 12
155 | 3 | 133
155 | 1 | 144
156 | 6 | 1
156 | 7 | 5

The desired output is:

155| 2,3,1 | 12,133,144

156| 6,7 | 1,5

How can I achieve this in Pig?

1 answer:

Answer 0 (score: 9)

    grouped = GROUP table BY userid;
    X = FOREACH grouped GENERATE group as userid,
                                 table.clickcount as clicksbag,
                                 table.pagenumber as pagenumberbag;

X will now be:

{(155,{(2),(3),(1)},{(12),(133),(144)}),
 (156,{(6),(7)},{(1),(5)})}
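
A fuller sketch of the grouping step, including loading the data and inspecting the intermediate relation, might look like the following. The file name clicks.txt, the '|' delimiter with no surrounding spaces, the int field types, and the alias clicks (standing in for table) are all assumptions made for illustration; none of them come from the question.

```pig
-- Assumption: 'clicks.txt' is a hypothetical pipe-delimited file
-- with lines such as 155|2|12 (no spaces around the delimiter)
clicks = LOAD 'clicks.txt' USING PigStorage('|')
         AS (userid:int, clickcount:int, pagenumber:int);

-- One row per userid, each carrying a bag of that user's original rows
grouped = GROUP clicks BY userid;

-- Project each column out of the grouped bag
X = FOREACH grouped GENERATE group AS userid,
                             clicks.clickcount AS clicksbag,
                             clicks.pagenumber AS pagenumberbag;

DESCRIBE X;  -- prints the schema of X
DUMP X;      -- runs the job and prints the tuples
```

Note that Pig does not guarantee the order of tuples inside a bag, so the values within each bag may not appear in input order.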

Now you need to use the built-in UDF BagToTuple:

    -- The alias is named result because OUTPUT is a reserved keyword in Pig Latin
    result = FOREACH X GENERATE userid,
                                BagToTuple(clicksbag) as clickcounts,
                                BagToTuple(pagenumberbag) as pagenumbers;

result should now contain what you want. You can also fold this last step into the grouping step:

    result = FOREACH grouped GENERATE group as userid,
                                      BagToTuple(table.clickcount) as clickcounts,
                                      BagToTuple(table.pagenumber) as pagenumbers;
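
BagToTuple yields a tuple such as (2,3,1). If a single delimited string is wanted instead, which is closer to what MySQL's group_concat() returns, the built-in BagToString may be a better fit. The following is a minimal sketch, assuming a Pig release that ships BagToString, a hypothetical pipe-delimited input file clicks.txt, and illustrative alias and output names (clicks, concatenated, clicks_concat) that are not part of the original question.

```pig
-- Assumption: 'clicks.txt' is a hypothetical file with lines such as 155|2|12
clicks = LOAD 'clicks.txt' USING PigStorage('|')
         AS (userid:int, clickcount:int, pagenumber:int);

grouped = GROUP clicks BY userid;

-- Join the values of each bag into one comma-separated chararray
concatenated = FOREACH grouped GENERATE
                   group AS userid,
                   BagToString(clicks.clickcount, ',') AS clickcounts,
                   BagToString(clicks.pagenumber, ',') AS pagenumbers;

-- Write the result as pipe-delimited text; 'clicks_concat' is a hypothetical output path
STORE concatenated INTO 'clicks_concat' USING PigStorage('|');
```

As with BagToTuple, the order of the concatenated values follows the order of tuples in the bag, which Pig does not guarantee to match the input order.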