After running my Pig script:
FILE = LOAD 'PATH_FILE'
USING PigStorage(',') as
(ID:Long,
MUNICIPALITY:Chararray,
CITY:Int,
COUNTRY:Int,
COMPANY:Long,
BRAND:Long,
DATE:Chararray,
STOCK_NAME:Chararray,
STOCK_SIZE:Double,
STOCK_AMOUNT:Double);
DATA = GROUP FILE BY (ID,MUNICIPALITY);
GRP_DATA = FOREACH DATA GENERATE group as STOCK_ID, FILE.COMPANY as COMPANY, FILE.BRAND as BRAND,FILE.DATE as DATE, FILE.STOCK_NAME AS STOCK_NAME, SUM(FILE.STOCK_AMOUNT) as STOCK_AMOUNT;
RANKING = rank GRP_DATA by STOCK_NAME,COMPANY,BRAND;
STORE RANKING INTO 'PATH_DESTINATION' USING PigStorage(',');
I get this output:
1,(7287026502032012,18),{(706)},{(101200010)},{(17286)},{(oz)},2.5
How can I use Pig to get this line instead:
1,7287026502032012,18,706,101200010,17286,oz,2.5
Is this possible?
Thanks a lot!!
Answer 0 (score: 0)
You can use a regular expression to remove every (, ), { and } character:
[(){}]+
See the regex demo.
In Pig:
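-- Load each row as a single chararray; TextLoader does not split on a delimiter.
A = LOAD 'input.txt' USING TextLoader() AS (line:chararray);
-- Strip every run of parentheses and braces from the line.
B = FOREACH A GENERATE REPLACE(line, '[(){}]+', '') AS line;
DUMP B;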
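To sanity-check the pattern outside Pig, here is a minimal Python sketch that applies the same character class to the sample row from the question. The helper name strip_brackets is hypothetical and only for illustration; Python's re module stands in for Pig's REPLACE, so this checks the regex and not the Pig job itself.

```python
import re

# Sample row copied from the question's output.
row = "1,(7287026502032012,18),{(706)},{(101200010)},{(17286)},{(oz)},2.5"

# Same character class as in the answer: one or more of ( ) { }.
BRACKETS = re.compile(r"[(){}]+")


def strip_brackets(line: str) -> str:
    """Remove every run of parentheses and braces from a line (hypothetical helper)."""
    return BRACKETS.sub("", line)


print(strip_brackets(row))
# prints: 1,7287026502032012,18,706,101200010,17286,oz,2.5
```

Note that this is purely textual clean-up. If a bag holds more than one tuple, for example {(706),(707)}, the result is 706,707 and the row gains an extra column, and any parentheses or braces inside a field value such as STOCK_NAME would be removed too.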