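One of my columns (named Product) is defined as a chararray and takes three values: OT, AT and HP. I want to create a new column that converts these values to integers.
To do this I wrote a FOREACH statement: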
REGISTER '/usr/lib/pig/piggybank.jar';
File = load '/user/cloudera/file.csv'
USING org.apache.pig.piggybank.storage.CSVExcelStorage(',')
as (ID:Long,
Chain:Int,
Dept:Int,
Product_Measure:Chararray,
Price:Double);
Values = FOREACH File Generate
ID,
Chain,
Dept,
((Chararray)Product_Measure=='OT'?'1':(Chararray)Product_Measure=='AT'?'2':(Chararray)Product_Measure=='HP'?'3':'0') as Product_Measure,
(Price<0.1?0:Price) as Price;
Filter_Values = FILTER Values BY Price > 0;
DUMP Filter_Values;
If I remove the third line it works fine, so I think the problem arises when I try to convert the chararray to an int.
Can anyone help me?
Thanks!
Answer 0 (score: 0)
Values = FOREACH Source Generate
ID,
Date,
((Chararray)Product_Measure == 'OT' ? (int)1 : ((Chararray)Product_Measure == 'AT' ? (int)2 : ((Chararray)Product_Measure == 'HP' ? (int)3 : 0))) as Product_Value,
(Quantity<0?0:Quantity) as Quantity,
(Price<0.1?0:Price) as Price;
Or, if you want NULL instead:
Values = FOREACH Source Generate
ID,
Date,
((Chararray)Product_Measure == 'OT' ? '1' : ((Chararray)Product_Measure == 'AT' ? '2' : ((Chararray)Product_Measure == 'HP' ? '3' : 'NULL'))) as Product_Value,
(Quantity<0?0:Quantity) as Quantity,
(Price<0.1?0:Price) as Price;
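You need to make two changes in your Pig script.
1. Use == instead of = in the comparisons.
2. If you want a NULL value, make all the replacement values chararray; otherwise make all the replacement values int.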
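For reference, here is a minimal sketch of the int variant applied to the schema from the question (relation File with ID, Chain, Dept, Product_Measure and Price). It has not been run against the original data. The output name Product_Value is taken from the answer above. Wrapping each nested bincond in its own parentheses and writing 0.0 rather than 0 for the Price fallback are defensive assumptions, since some Pig versions are strict about bincond syntax and about both branches having the same type.

```pig
-- Assumes File was loaded with the schema shown in the question:
-- (ID:long, Chain:int, Dept:int, Product_Measure:chararray, Price:double)
-- Every branch of the nested bincond returns an int, so the new column is an int.
Values = FOREACH File GENERATE
    ID,
    Chain,
    Dept,
    (Product_Measure == 'OT' ? 1 :
        (Product_Measure == 'AT' ? 2 :
            (Product_Measure == 'HP' ? 3 : 0))) AS Product_Value:int,
    (Price < 0.1 ? 0.0 : Price) AS Price;

-- Rows whose price was clamped to 0.0 are dropped here, as in the question.
Filter_Values = FILTER Values BY Price > 0;
DUMP Filter_Values;
```

Running DESCRIBE Values; after the FOREACH is a quick way to confirm that Product_Value came out as an int rather than a chararray.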