Calculating percentages using Pig Latin

Time: 2016-07-14 20:09:24

Tags: apache-pig

I have a table with two columns (code: chararray, sp: double).

I want to calculate each row's sp as a percentage of the total sp.

INPUT
t001 60
a002 75
a003 34
bb04 56
bbc5 23
cc2c 45
ddc5 45

Expected output:

code Perc
t001 17%
a002 22%
a003 10%
bb04 16.5%
bbc5 6%
cc2c 13.3%
ddc5 13.3%

I tried the following, but I am not getting any output.

A = load '....' as (code : chararray, sp : double); 
B = GROUP A BY (code); 
allcount = FOREACH B GENERATE SUM(A.speed) as total; 
perc = FOREACH A GENERATE code,speed/(double)allcount.total * 100; 
dump perc;

How can I do this in Pig Latin?

1 answer:

Answer 0: (score: 0)

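You are loading the second column into a field named sp but then referring to it as speed. I am assuming the columns are separated by a space; if they are tab-separated, use PigStorage('\t') in the LOAD statement instead.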

A = LOAD '/YourFilePath/YourFile.txt' USING PigStorage(' ') AS (code:chararray, sp:double); 
B = GROUP A ALL; 
C = FOREACH B GENERATE SUM(A.sp) AS total; 
D = FOREACH A GENERATE code,ROUND_TO((sp/(double)C.total) * 100,2) AS perc;
E = FOREACH D GENERATE code,CONCAT((chararray)perc,'%'); 
DUMP E;
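
As a sanity check on what the script computes, the figures for the sample input can be worked out by hand. These values are hand-calculated from the input in the question, not copied from a Pig run, and are rounded to two decimals the way ROUND_TO(..., 2) would round them:

```latex
\[
\text{total} = 60 + 75 + 34 + 56 + 23 + 45 + 45 = 338
\]
\[
\begin{aligned}
\text{t001}: &\quad 60/338 \times 100 \approx 17.75 \\
\text{a002}: &\quad 75/338 \times 100 \approx 22.19 \\
\text{a003}: &\quad 34/338 \times 100 \approx 10.06 \\
\text{bb04}: &\quad 56/338 \times 100 \approx 16.57 \\
\text{bbc5}: &\quad 23/338 \times 100 \approx 6.80 \\
\text{cc2c}: &\quad 45/338 \times 100 \approx 13.31 \\
\text{ddc5}: &\quad 45/338 \times 100 \approx 13.31
\end{aligned}
\]
```

The expected output in the question appears to use approximate figures, so small differences from these numbers are to be expected.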

Output: (screenshot not included)
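
The key change from the attempt in the question is GROUP A ALL: grouping by code gives one sum per code, whereas ALL produces a single-row relation holding the grand total, which is what a scalar reference such as C.total needs. If you would rather not rely on that scalar reference, a CROSS against the one-row total is another way to attach the total to every row. This is only a sketch under the same assumptions as above (placeholder file path, space-delimited input), and ROUND_TO is, as far as I know, only available in Pig 0.13 and later:

```pig
-- Load the two columns; path and delimiter are the same placeholders as above.
A = LOAD '/YourFilePath/YourFile.txt' USING PigStorage(' ') AS (code:chararray, sp:double);

-- Collapse the whole relation into one group and sum sp to get the grand total.
B = GROUP A ALL;
C = FOREACH B GENERATE SUM(A.sp) AS total;

-- C has exactly one row, so CROSS simply appends total to every row of A.
D = CROSS A, C;

-- Compute the percentage, rounded to two decimals, and append the % sign.
E = FOREACH D GENERATE A::code AS code, ROUND_TO((A::sp / C::total) * 100, 2) AS perc;
F = FOREACH E GENERATE code, CONCAT((chararray)perc, '%');
DUMP F;
```

The scalar version is shorter; the CROSS version just makes the join with the total explicit.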