I am analyzing cluster user log files with the following Pig code:
t_data = load 'log_flies/*' using PigStorage(',');
A = foreach t_data generate
    $0 as (jobid:int), $1 as (indexid:int), $2 as (clusterid:int),
    $6 as (user:chararray), $7 as (stat:chararray), $13 as (queue:chararray),
    $32 as (projectName:chararray), $52 as (cpu_used:float), $55 as (efficiency:float),
    $59 as (numThreads:int), $61 as (numNodes:int), $62 as (numCPU:int),
    $72 as (comTime:int), $73 as (penTime:int), $75 as (runTime:int),
    $52/($62*$75) as (allEff:float), SUBSTRING($68, 0, 11) as (endTime:chararray);
---describe A;
A = foreach A generate jobid, indexid, clusterid, user, cpu_used, numThreads, runTime, allEff, endTime;
B = group A by user;
f_data = foreach B {
grp = group;
count = COUNT(A);
avg = AVG(A.cpu_used);
generate FLATTEN(grp), count, avg;
};
f_data = limit f_data 10;
dump f_data;
The code works with GROUP and COUNT, but when I include AVG or SUM it shows this error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias f_data
I checked the data types and everything looks fine. Any suggestions about what I am missing? Thanks in advance for your help.
Answer 0 (score: 1)
Syntax error. Read http://chimera.labs.oreilly.com/books/1234000001811/ch06.html#more_on_foreach (section: nested foreach) for details.
Pig script:
A = LOAD 'a.csv' USING PigStorage(',') AS (user:chararray, cpu_used:float);
B = GROUP A BY user;
C = FOREACH B {
cpu_used_bag = A.cpu_used;
GENERATE group AS user, AVG(cpu_used_bag) AS avg_cpu_used, SUM(cpu_used_bag) AS total_cpu_used;
};
Input: a.csv
a,3
a,4
b,5
Output:
(a,3.5,7.0)
(b,5.0,5.0)
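Applied to the relation names from the question, the same nested-FOREACH pattern would look roughly like this. This is only a sketch: it assumes A keeps the schema from the question's first FOREACH, that cpu_used is really numeric, and top_users is just an illustrative alias for the LIMIT result.

B = GROUP A BY user;
f_data = FOREACH B {
    cpu_bag = A.cpu_used;                  -- project the column once inside the nested block
    GENERATE group AS user,
             COUNT(A) AS job_count,
             AVG(cpu_bag) AS avg_cpu_used,
             SUM(cpu_bag) AS total_cpu_used;
};
top_users = LIMIT f_data 10;               -- keep the LIMIT result under its own alias
DUMP top_users;

If the underlying field is not actually numeric, the aggregates can still fail at run time, so it is worth confirming the types with DESCRIBE A before adding AVG and SUM.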
Answer 1 (score: 0)
Your Pig script is full of errors:
Use PigLoader() as (mention your schema appropriately);
A = foreach A generate jobid, indexid, clusterid, user, cpu_used, numThreads, runTime, allEff, endTime;
Change this to: F = foreach A generate jobid, indexid, clusterid, user, cpu_used, numThreads, runTime, allEff, endTime;
f_data = limit f_data 10; change this so the LIMIT result is stored under a different alias instead of reusing f_data, for example:
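For instance (a sketch; top10 is just an illustrative alias, and f_data is assumed to be the grouped-and-aggregated relation from the question):

top10 = limit f_data 10;   -- new alias instead of overwriting f_data
dump top10;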
Don't make your life complicated. A general rule for debugging a Pig script:
Write a small sample Pig script that mimics yours (this one works):
t_data = load './file' using PigStorage(',') as (jobid:int, cpu_used:float);
C = foreach t_data generate jobid, cpu_used ;
B = group C by jobid ;
f_data = foreach B {
count = COUNT(C);
sum = SUM(C.cpu_used);
avg = AVG(C.cpu_used);
generate FLATTEN(group), count,sum,avg;
};
never_f_data = limit f_data 10;
dump never_f_data;
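Once this minimal version dumps correctly, it can be grown one field at a time toward the real script. A sketch of a possible next step, switching the grouping to the user column (the positions $6 and $52 are taken from the question and assumed to be correct):

t_data = load 'log_flies/*' using PigStorage(',');
C = foreach t_data generate (chararray)$6 as user, (float)$52 as cpu_used;   -- explicit casts, since this load has no schema
B = group C by user;
f_data = foreach B {
    count = COUNT(C);
    sum = SUM(C.cpu_used);
    avg = AVG(C.cpu_used);
    generate FLATTEN(group), count, sum, avg;
};
top_users = limit f_data 10;   -- a fresh alias for the LIMIT, as above
dump top_users;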