Split a tuple containing a string array into multiple tuples

Time: 2015-02-21 08:01:09

Tags: apache-pig

I have tuples like this:
A = 
(1, ["Football","Baseball"])
(2, ["Swimming","Baseball"])

I want to split each tuple on its string array so that the final result looks like this:

(1, "Football")
(1, "Baseball")
(2, "Swimming")
(2, "Baseball")

How can I do this in Pig?

1 answer:

Answer 0: (score: 0)

First use the REPLACE function to remove the '[' and ']' characters from the input, then wrap the results in a bag with TOBAG and FLATTEN it.

Input:

1,["Football","Baseball"]
2,["Swimming","Baseball"]

Pig script:

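-- PigStorage(',') splits each line on every comma, so a two-element array arrives as f2 and f3.
A = LOAD 'input' USING PigStorage(',') AS (f1:int,f2:chararray,f3:chararray);
-- Strip '[' and ']' from each piece, collect the pieces in a bag, and flatten it into one row per element.
B = FOREACH A GENERATE f1,FLATTEN(TOBAG(REPLACE(f2,'[\\[\\]]',''),REPLACE(f3,'[\\[\\]]','')));
DUMP B;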

Output:

(1,"Football")
(1,"Baseball")
(2,"Swimming")
(2,"Baseball")
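
The script above declares exactly two array fields (f2 and f3), so it only covers arrays with two elements. If the arrays can vary in length, one possible variation is to load each line whole and let TOKENIZE build the bag. This is only a sketch under some assumptions: every line has the form id,[...] with no trailing whitespace, no element contains a comma, and the installed Pig version accepts the optional delimiter argument to TOKENIZE. The aliases line, items, and item are made up for illustration.

```pig
-- Assumption: each input line looks like 1,["Football","Baseball"]
-- TextLoader reads every line as a single chararray field.
A = LOAD 'input' USING TextLoader() AS (line:chararray);

-- Capture group 1 is the id, group 2 is everything between '[' and ']'.
B = FOREACH A GENERATE
        (int)REGEX_EXTRACT(line, '^(\\d+),\\[(.*)\\]$', 1) AS f1,
        REGEX_EXTRACT(line, '^(\\d+),\\[(.*)\\]$', 2) AS items;

-- TOKENIZE splits items on ',' into a bag of single-field tuples;
-- FLATTEN then emits one (f1, item) row per element, however many there are.
C = FOREACH B GENERATE f1, FLATTEN(TOKENIZE(items, ',')) AS item;
DUMP C;
```

Lines that do not match the pattern would come out of REGEX_EXTRACT as nulls, so a FILTER on items IS NOT NULL before the last step may be worth adding for messy input.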