How to combine multiple rows of a relation into a single tuple to perform a calculation in Pig Latin

Asked: 2016-02-26 07:37:29

Tags: apache-pig

I have the following code:

pitcher_res = UNION pitcher_total_salary,pitcher_total_appearances;
dump pitcher_res;

The output is:

(8965000.0)
(22.0)

However, I want to compute 8965000.0 / 22.0, so I need something like this:

res = FOREACH some_relation GENERATE $0/$1;

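So I need some_relation = (8965000.0, 22.0). How can I perform this conversion?

3 Answers:

Answer 0 (score: 0)

You can use CROSS. From the documentation:

"Computes the cross product of two or more relations."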

https://pig.apache.org/docs/r0.11.1/basic.html#cross
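
Applied to the relations from the question, this might look like the sketch below. It assumes that pitcher_total_salary and pitcher_total_appearances each contain exactly one single-field tuple, as the dumped output suggests; `combined` is a name chosen here for illustration only.

```pig
-- CROSS pairs every tuple of the first relation with every tuple of the second.
-- With one tuple on each side, the result is the single tuple (8965000.0, 22.0).
combined = CROSS pitcher_total_salary, pitcher_total_appearances;

-- $0 is the salary field and $1 the appearances field of the combined tuple.
res = FOREACH combined GENERATE $0 / $1;
DUMP res;
```

If either relation held more than one tuple, CROSS would produce every pairing, so this only yields a single result under the one-row assumption.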

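Answer 1 (score: 0)

Ideally, every entry in your source relations has a unique identifier. You can then join on that identifier, which produces the kind of relation you want.

Salaries relation

salaries: pitcher_id, pitcher_total_salary

Total appearances relation

appearances: pitcher_id, pitcher_total_appearances

Join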

pitcher_relation = join salaries by pitcher_id, appearances by pitcher_id;

Compute

res = FOREACH pitcher_relation GENERATE pitcher_total_salary/pitcher_total_appearances;
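
Put together, the join approach could be sketched as follows. The input paths, the field types, and the tab-delimited layout (the default for LOAD) are assumptions made for illustration; they do not come from the question.

```pig
-- Hypothetical input files: one row per pitcher, tab-delimited.
salaries = LOAD 'salaries.tsv' AS (pitcher_id:chararray, pitcher_total_salary:double);
appearances = LOAD 'appearances.tsv' AS (pitcher_id:chararray, pitcher_total_appearances:double);

-- Matching rows from both relations end up side by side in one tuple.
pitcher_relation = JOIN salaries BY pitcher_id, appearances BY pitcher_id;

-- pitcher_id exists on both sides of the join, so it needs the relation:: prefix;
-- the two total fields are unambiguous and can be referenced directly.
res = FOREACH pitcher_relation GENERATE
    salaries::pitcher_id AS pitcher_id,
    pitcher_total_salary / pitcher_total_appearances AS salary_per_appearance;
DUMP res;
```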

Answer 2 (score: 0)

The following Pig Latin script should solve your problem:

Load the salary file

salary = load '/home/abhishek/Work/pigInput/pitcher_total_salary' as (salary:long);

Load the appearances file

appearances = load '/home/abhishek/Work/pigInput/pitcher_total_appearances' as (appearances:long);

Now, use the CROSS command

C = cross salary, appearances;

Then, compute the final result

res = foreach C generate salary/appearances;

Output

dump res;
407500
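
Note that both fields are loaded as long here, so the division is an integer division; it happens to be exact for these values (22 × 407500 = 8965000). If a fractional result is needed for other inputs, one option is to cast the operands to double, as in this minimal sketch that reuses the relation C from above:

```pig
-- Casting to double keeps the fractional part of the quotient.
res = foreach C generate (double)salary / (double)appearances;
dump res;
```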

Hope this helps.