Apache Pig: transform rows into a character-delimited string

Date: 2016-03-09 15:01:19

Tags: apache-pig

I need to collapse the VALUE column into a single row per city, with the values separated by the "|" (pipe) character.

  

DATA = LOAD '/tmp/test.dat' USING PigStorage(',') AS (
         CITY:chararray,
         VALUE:chararray
   );

Input (CITY,VALUE):

ISTANBUL,1

ISTANBUL,2

ISTANBUL,3

NEWYORK,8

NEWYORK,9

Output:

ISTANBUL,1 | 2 | 3

NEWYORK,8 | 9

1 answer:

Answer 0 (score: 2)

First do a GROUP on CITY, then use BagToString (http://pig.apache.org/docs/r0.15.0/func.html#bagtostring) to convert each group's values into the desired string representation. Something like this (untested!):

data = LOAD '/tmp/test.dat' using PigStorage(',') AS (city:chararray, value:chararray);
data_grp = GROUP data BY city;
result = FOREACH data_grp GENERATE group AS city, BagToString(data.value, '|') AS values;
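
One caveat with the script above: a bag in Pig is unordered, so after GROUP the values may not be joined in input order. If the order matters, a nested FOREACH that sorts each bag first is one possible variation. The following is an untested sketch, not part of the original answer. The ' | ' delimiter (with spaces) is an assumption chosen to match the spacing in the question's sample output, and the output path /tmp/test_out is hypothetical.

```pig
-- Sketch only (untested): sort each city's bag before joining it.
data = LOAD '/tmp/test.dat' USING PigStorage(',') AS (city:chararray, value:chararray);
data_grp = GROUP data BY city;
result = FOREACH data_grp {
    -- value is a chararray, so this ordering is lexicographic, not numeric
    sorted = ORDER data BY value;
    -- ' | ' is an assumed delimiter, picked to mirror the sample output
    GENERATE group AS city, BagToString(sorted.value, ' | ') AS values;
};
-- Hypothetical output path; PigStorage(',') writes a comma between city and the joined string
STORE result INTO '/tmp/test_out' USING PigStorage(',');
```

If the values should sort numerically instead, declaring value as int in the LOAD schema would likely be the simplest change, since ORDER then compares numbers rather than strings.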