如何对pig

时间:2016-06-24 18:30:51

标签: hadoop apache-pig

我正在尝试计算原始数据的同比增长。通过使用针对(铅)我可以获得去年的数据以及当前年度数据。但我无法对stitch over()子句返回的列进行任何计算。以下是我到目前为止所尝试的内容,

grunt> data = LOAD 'loan_pig' USING org.apache.hive.hcatalog.pig.HCatLoader();  
grunt> grp1 = group data by issue_yr;  
grunt> tot_loan = foreach grp1{ cnt = COUNT(data.id); generate FLATTEN(group) as issue_yr,cnt as ln_cnt;};  
grunt> grp2 = group tot_loan all;  
grunt> loan_yr = foreach grp2{ srt = order tot_loan by issue_yr desc; generate FLATTEN(Stitch(srt, Over(srt.ln_cnt,'lead',0,1,1,0)));};  
grunt> final = foreach loan_yr generate issue_yr,ln_cnt,$2;  
grunt> describe final;

当我在决赛中描述时,它会显示

final: {stitched::issue_yr: int,stitched::ln_cnt: long,NULL}

NULL代表“潜在客户”值列。 当我尝试对此列进行任何数学计算时,它会抛出以下错误:

grunt> final1 = foreach loan_yr generate issue_yr,ln_cnt,$2 as pr_yr;  
grunt> fn = foreach final1 generate issue_yr,ln_cnt-pr_yr; 
  

2016-06-24 11:23:42,118 [main] ERROR org.apache.pig.tools.grunt.Grunt    - ERROR 1052:无法将bytearray转换为long

任何人都可以告诉我是否有任何方法可以使用Stitch ... Over返回的列进行计算。或者他们甚至可以在猪身上?

0 个答案:

没有答案