使用猪产生最大数量

时间:2014-02-21 02:26:32

标签: apache-pig

我试图为以下数据找到生成年份,MAX(数字)并且它给出了错误说明

错误1045:无法将org.apache.pig.builtin.MAX的匹配函数推断为多个或不适合。请使用明确的演员。

我使用了命令

loadfirstoutput = load '/outt/part-r-00000' as (year:chararray, number:chararray); 
foreach2 = foreach loadfirstoutput generate year, MAX(number);  
dump foreach2;

 ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.MAX as multiple or none of them fit. Please use an explicit cast.

"   8
A"  6
"0" 4004
Ng" 1
1)" 1
Co" 5
/i>"    12
#4)"    1
&amp    2
21)"    1
22)"    2
38)"    1
80)"    1
Now"    1
Son"    1
"Unk"   1
Budd"   1
Food"   1
Ginn"   1
Hate"   1
Jax)"   1
Lang"   1
More"   1
Ross"   1
Sans"   1
Sign"   2
Sons"   1
Stan"   1
"1378"  1
"1806"  1
"1900"  2
"1901"  5
"1902"  2
"1904"  1
"1906"  1
"1908"  1
"1909"  2
"1910"  1
"1911"  14
"1914"  1
"1917"  1
"1920"  29
"1921"  2
"1923"  10
"1924"  2

2 个答案:

答案 0 :(得分:3)

有点难以说明你的数据是怎么回事。但假设它是模式建议你需要先分组。

loadfirstoutput = load '/outt/part-r-00000' as (name:chararray, year:chararray, number:chararray); 
A = GROUP  loadfirstoutput ALL;
B = FOREACH A GENERATE MAX(loadfirstoutput.number);  
dump B;

这将为您提供最大“数字”

如果您想要每年的最大数量

loadfirstoutput = load '/outt/part-r-00000' as (name:chararray, year:chararray, number:chararray); 
    A = GROUP  loadfirstoutput BY year;
    B = FOREACH A GENERATE MAX(loadfirstoutput.number);  
    dump B;

答案 1 :(得分:0)

这不能回答这个问题,但是在MAX混合了不同类型的情况下,在另一种情况下发生了同样的错误:

FOREACH fltrd GENERATE ids, MAX(TOBAG(suu, 1)) AS uu;

其中suu是一个长字段。

我必须通过将L添加到1来将int 1转换为long 1:

FOREACH fltrd GENERATE ids, MAX(TOBAG(suu, 1L)) AS uu;