Data schema: sdesc:chararray, samt:int, syear:chararray, stype:chararray
Data:
Wrench 259000 2000 store
Wrench 135000 2000 online
Wrench 175000 2001 online
Wrench 180000 2001 store
Script:
ysales = LOAD 'salesdata.txt' USING PigStorage() AS (sdesc:chararray, samt:int, syear:chararray, stype:chararray);
basedata = FILTER ysales BY (sdesc == 'Wrench') AND (syear == '2000') AND (stype == 'store');
My result set from DUMP basedata; is:
(Wrench,259000,2000,store)
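So the question is: how do I break basedata apart into separate values, for example A = 'Wrench', B = 259000, C = 2000, D = 'store'?
Answer 0 (score: 0)
You can extract the value of each column by its positional reference ($0, $1, ...):
a = foreach basedata generate $0;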
b = foreach basedata generate $1;
c = foreach basedata generate $2;
d = foreach basedata generate $3;
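As a variation on the statements above, the four columns can also be projected in one pass and given the names from the question. This is only a sketch: the relation name parts is hypothetical, and A, B, C, D end up as field aliases within one relation, not as standalone scalar variables.

```pig
-- Sketch: rename the four positional fields in a single FOREACH.
-- "parts" is an illustrative relation name, not from the original post.
parts = FOREACH basedata GENERATE $0 AS A, $1 AS B, $2 AS C, $3 AS D;
-- Individual fields can then be referenced by alias, e.g. only A and B:
ab = FOREACH parts GENERATE A, B;
DUMP ab;
```

Keeping the values in one relation avoids scanning basedata four times, at the cost of not having four separate relations.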
Answer 1 (score: 0)
data = load '/home/satish/wrench' using PigStorage(' ') as (name,total,year,type) ;
-- You can add a FILTER here if you want one
reqdata = foreach data generate CONCAT('A','=',name) as A, CONCAT('B','=',total) as B, CONCAT('C','=',year) as C,CONCAT('D','=',type) as D;
dump reqdata;
(A=Wrench,B=259000,C=2000,D=store)
(A=Wrench,B=135000,C=2000,D=online)
(A=Wrench,B=175000,C=2001,D=online)
(A=Wrench,B=180000,C=2001,D=store)
fdata = foreach reqdata generate A,B;
dump fdata;
(A=Wrench,B=259000)
(A=Wrench,B=135000)
(A=Wrench,B=175000)
(A=Wrench,B=180000)
-- If you want to remove the tuple nesting, use FLATTEN
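The answer does not show FLATTEN in use, so the following is a hedged illustration only. It assumes the data relation loaded earlier in this answer; the relation names packed and unpacked and the field alias pair are hypothetical. TOTUPLE is used here just to create a nested tuple for FLATTEN to undo.

```pig
-- Sketch: build a nested tuple field, then un-nest it with FLATTEN.
-- "packed", "unpacked" and "pair" are illustrative names only.
packed = FOREACH data GENERATE TOTUPLE(name, total) AS pair;
-- Each row of packed holds one tuple-valued field.
unpacked = FOREACH packed GENERATE FLATTEN(pair);
-- After FLATTEN, the tuple's elements become top-level fields again.
DUMP unpacked;
```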