Filter and replace a column in Pig

Date: 2013-09-13 23:36:59

Tags: filter foreach replace apache-pig

Suppose my data looks like this:

row1 cats val12 val13
row2 dogs val22 val23
row3 cats val32 val33
...

data = load 'file' AS (row:chararry, pets:charray, val2:charray, val3:charray);

Filter the data to keep only the 'cats' rows:

felines = filter data by (pets matches 'cats');

Now change 'cats' to 'lions':

lions = foreach felines generate replace (pets, 'cats', 'lions');
dump lions;

(lions)
(lions)
...

My goal is to create new rows to add to my table:

newFelines = foreach lions generate rows, lions, val1, val2;
                                    Error ^^^^^
"Error during parsing. Scalars can be only used with projections"

How can I get a set containing the following new rows?

row1 lions val11 val12
row3 lions val31 val32

TIA,

1 answer:

Answer 0 (score: 3)

Line by line:

There is no 'chararry' or 'charray' data type; the type is spelled 'chararray':

data = load 'file' USING PigStorage(' ') AS
    (row:chararray, pets:chararray, val2:chararray, val3:chararray);

Extract the 'cats' rows:

felines = filter data by (pets matches 'cats');

Replacing 'cats' with 'lions' can be done like this:

lions = foreach felines generate row, REPLACE(pets, 'cats', 'lions'), val2, val3;

Or like this, with a constant:

lions = foreach felines generate row, 'lions', val2, val3;
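
As for the error in the question: it most likely arises because `lions` inside the GENERATE clause names a relation, not a field. Pig treats a relation used in an expression as a scalar, which has to be projected (for example `lions.pets`). In addition, the question's `lions` relation held only the single replaced column, so the other fields were no longer available. A minimal sketch of the whole flow follows, assuming the input is space-delimited as in the load statement above. The `AS pets` alias, the final UNION step, and the relation name `combined` are additions for illustration, not part of the original answer.

```pig
-- Load space-delimited rows; 'file' is the path used in the question.
data = LOAD 'file' USING PigStorage(' ')
       AS (row:chararray, pets:chararray, val2:chararray, val3:chararray);

-- Keep only the rows whose pets field is 'cats'.
felines = FILTER data BY (pets matches 'cats');

-- Rebuild each row with the replaced value, naming the new column
-- so that later statements can still refer to it as pets.
lions = FOREACH felines
        GENERATE row, REPLACE(pets, 'cats', 'lions') AS pets, val2, val3;

-- Append the new rows to the original relation
-- (hypothetical step, based on the stated goal of adding rows to the table).
combined = UNION data, lions;
DUMP combined;
```

Naming the replaced column with AS keeps the schema of `lions` the same as that of `data`, so the two relations line up field by field in the UNION.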