pig latin - Creating multiple output rows from a single input row

Time: 2014-04-08 03:40:03

Tags: hadoop apache-pig

My input data looks like this:

  

Row1 | 2014-04-04 18:46:18 | 37.52 | -84.34 | 30870 | 580.372 ms,759.065 ms,   695.879 ms

     

Row2 | 2014-04-04 18:47:18 | 37.68 | -84.34 | 31127 | 619.341 ms,725.121 ms,   696.790 ms

How can I split each row into multiple rows, one per value in the last field, as shown below:

  

Row1 | 2014-04-04 18:46:18 | 37.52 | -84.34 | 30870 | 580.372 ms

     

Row1 | 2014-04-04 18:46:18 | 37.52 | -84.34 | 30870 | 759.065 ms

     

Row1 | 2014-04-04 18:46:18 | 37.52 | -84.34 | 30870 | 695.879 ms

     

Row2 | 2014-04-04 18:47:18 | 37.68 | -84.34 | 31127 | 619.341 ms

     

Row2 | 2014-04-04 18:47:18 | 37.68 | -84.34 | 31127 | 725.121 ms

     

Row2 | 2014-04-04 18:47:18 | 37.68 | -84.34 | 31127 | 696.790 ms

Thanks in advance.

1 Answer:

Answer 0: (score: 0)

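You can use FLATTEN, but it has to be applied to a bag. Flattening a tuple (which is what STRSPLIT returns) only adds columns to the same row, whereas flattening a bag emits one row per element. TOKENIZE with a ',' delimiter returns a bag, so, for example:

a = load 'test.txt' using PigStorage('|') as (c1: chararray, c2: chararray, c3: double, c4: double, c5: long, c6: chararray);
-- TOKENIZE splits c6 on ',' into a bag; FLATTEN on a bag yields one row per element
b = foreach a generate c1, c2, c3, c4, c5, FLATTEN(TOKENIZE(c6, ',')) as c6;
-- TRIM removes the spaces left around each value
c = foreach b generate c1, c2, c3, c4, c5, TRIM(c6) as c6;
dump c;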
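
To sanity-check the Pig output on a small sample, the intended transformation (one output row per comma-separated value in the last field) can be sketched in plain Python. This is not Pig code, only a reference for the expected result. The function name `explode_row` and its parameters are illustrative assumptions.

```python
def explode_row(line, field_sep="|", value_sep=","):
    """Return one output row per value found in the last field of `line`."""
    # Split the row into fields and drop the padding around each one.
    fields = [f.strip() for f in line.split(field_sep)]
    *head, last = fields
    # Split the last field on commas, trimming the uneven spacing.
    values = [v.strip() for v in last.split(value_sep) if v.strip()]
    # Repeat the leading fields once per value.
    return [" | ".join(head + [v]) for v in values]


sample = ("Row1 | 2014-04-04 18:46:18 | 37.52 | -84.34 | 30870 | "
          "580.372 ms,759.065 ms,   695.879 ms")
for row in explode_row(sample):
    print(row)
```

For the sample row this prints three lines, each carrying the first five fields followed by a single timing value, which is the shape the Pig job is expected to produce.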