I am trying to filter on a positional variable.
X = FILTER C BY($14 matches '.*USD.*');
STORE X into '$output' using PigStorage(',');
The statement above does not work, but if I try to output $14:
E = FOREACH C GENERATE FLATTEN($14);
STORE E into '$output' using PigStorage(',');
it works fine.
Sample data:
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,USD120
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,USD0
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,GBP0
Sample output:
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,USD0
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,GBP0
Answer 0: (score: 0)
Add a space between 'BY' and '(':
X = FILTER C BY (FLATTEN($14) matches '.*USD.*');
STORE X into '$output' using PigStorage(',');
Answer 1: (score: 0)
This works for me:
A = LOAD 'StackFile.txt' using PigStorage(',');
B = FILTER A BY ($14 matches '.*USD.*');
DUMP B;
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,USD120
304a285281be,1383027928890968764,receiver,10C,655362,C2,USD811289,1,0,0,ebay_checkout,cc,cc,USD2659,USD0
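
If the same filter still fails in a larger script, one thing worth trying is an explicit cast. The sketch below is only an illustration, assuming the data is loaded without a schema (so every field arrives as bytearray); the file name 'StackFile.txt' is taken from the snippet above, while the output path 'filtered_output' is a placeholder. Pig's matches operator follows Java regular-expression semantics and must match the whole string, which is why the pattern is wrapped in .* on both sides.

```pig
-- Load comma-separated rows; with no schema declared, fields are bytearray.
A = LOAD 'StackFile.txt' USING PigStorage(',');

-- Cast the 15th field ($14, zero-based) to chararray before matching,
-- and keep a space between BY and the opening parenthesis.
B = FILTER A BY ((chararray)$14 matches '.*USD.*');

-- 'filtered_output' is a hypothetical output directory.
STORE B INTO 'filtered_output' USING PigStorage(',');
```

With the three sample rows from the question, this would be expected to keep the rows ending in USD120 and USD0 and drop the one ending in GBP0.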