I need to collapse the VALUE column into a single row per city, with the values separated by the "|" (pipe) character.
DATA = LOAD '/tmp/test.dat' USING PigStorage(',') AS (CITY:chararray, VALUE:chararray);
Input (CITY,VALUE):
ISTANBUL,1
ISTANBUL,2
ISTANBUL,3
NEWYORK,8
NEWYORK,9
Output:
ISTANBUL,1 | 2 | 3
NEWYORK,8 | 9
Answer 0 (score: 2)
First group by CITY, then use BagToString (http://pig.apache.org/docs/r0.15.0/func.html#bagtostring) to convert each group's values into the desired string representation. Something like this (untested!):
data = LOAD '/tmp/test.dat' using PigStorage(',') AS (city:chararray, value:chararray);
data_grp = GROUP data BY city;
-- ' | ' (with surrounding spaces) matches the spacing shown in the expected output
result = FOREACH data_grp GENERATE group AS city, BagToString(data.value, ' | ') AS values;
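Pig does not guarantee the order of tuples inside a grouped bag, so the values may not come out as 1 | 2 | 3 every time. If the order matters, one option is to sort inside a nested FOREACH before calling BagToString. The following is an untested sketch under that assumption. The aliases sorted and joined_values are made up for illustration. Because value is loaded as a chararray, the sort is lexicographic ('10' sorts before '2'), so load it as int if a numeric order is needed.

```pig
data = LOAD '/tmp/test.dat' USING PigStorage(',') AS (city:chararray, value:chararray);
data_grp = GROUP data BY city;
result = FOREACH data_grp {
    -- sort each city's tuples by value before joining them
    sorted = ORDER data BY value;
    GENERATE group AS city, BagToString(sorted.value, ' | ') AS joined_values;
};
DUMP result;
```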