This is my output file, which I wrote with another Pig script:
1 3,5
2 4,6,7
I am trying to parse each line as (chararray, tuple):
data = load 'test45' as (x:chararray, y:tuple());
But when I try to dump the tuples, they come out empty:
rows = foreach data generate y;
()
()
Answer 0 (score: 1)
Try this.
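-- Load each line as a single string.
X = LOAD 'pigtuple.txt' AS (str:chararray);
-- Split on whitespace into the id and the comma-separated list.
X1 = FOREACH X GENERATE FLATTEN(STRSPLIT(str, '\\s+')) AS (id:int, attr:chararray);
-- Split the list on commas; STRSPLIT returns a tuple.
X3 = FOREACH X1 GENERATE id, STRSPLIT(attr, ',') AS (y:tuple());
X4 = foreach X3 GENERATE id,y;
dump X4;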
If you want to access each element of the tuple:
X4 = foreach X3 GENERATE y.$0,y.$1;
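As for why the original load came back empty: as far as I know, PigStorage only converts a field to a tuple when the text is wrapped in parentheses, e.g. (3,5). A bare 3,5 cannot be cast, so y ends up empty. If you can change the script that writes the file, a minimal sketch would look like the following. It assumes a hypothetical file 'test45_parens' whose columns are tab-separated (PigStorage's default delimiter) and whose second column is parenthesized; neither assumption comes from the question.

```pig
-- Assumed input (hypothetical file 'test45_parens'), tab-separated,
-- with the second column wrapped in parentheses:
--   1<TAB>(3,5)
--   2<TAB>(4,6,7)
data = LOAD 'test45_parens' USING PigStorage('\t') AS (x:chararray, y:tuple());
rows = FOREACH data GENERATE y;
DUMP rows;
```

If the file format cannot be changed, the STRSPLIT approach above is the way to go, since it builds the tuple from the plain comma-separated text.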