AVG on grouped data throws ERROR 1046: Use an explicit cast

Asked: 2015-02-04 00:48:39

Tags: hadoop mapreduce apache-pig bigdata

I have map data in a txt file:

[age#27,height#5.8]
[age#25,height#5.3]
[age#27,height#5.10]
[age#25,height#5.1]

I want to display the average height for each age group.

This is the LOAD statement:

records = LOAD '~/Documents/Pig_Map.txt' AS (details:map[]);
records: {details: map[]}

Then I grouped the data by age:

group_data = GROUP records BY details#'age';
group_data: {group: bytearray,records: {(details: map[])}}

To access details, I applied FLATTEN (I am not sure whether this step is needed):

flatten_records = FOREACH group_data GENERATE group,FLATTEN(records);
flatten_records: {group: bytearray,records::details: map[]}

DUMP flatten_records gives me the following output:

(25,[height#5.1,age#25])
(25,[height#5.3,age#25])
(27,[height#5.10,age#27])
(27,[height#5.8,age#27])

Now I want to get the average height. I tried this:

display_records = FOREACH flatten_records GENERATE group,AVG(records.details#'height');

The error is:

<line 10, column 57> Multiple matching functions for org.apache.pig.builtin.AVG with input schema: ({{(bytearray)}}, {{(double)}}). Please use an explicit cast.

Please advise.

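1 answer:

Answer 0 (score: 2)

Can you try this? It projects age and height out of the map into named fields before grouping, then averages the projected height field: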

records = LOAD '~/Documents/Pig_Map.txt' AS (details:map[]);
records1 = FOREACH records GENERATE details#'age' AS age,details#'height' AS height;
group_data = GROUP records1 BY age;
display_records = FOREACH group_data GENERATE group,AVG(records1.height);
dump display_records;

Output:

(25,5.199999999999999)
(27,5.449999999999999)
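Since the error message asks for an explicit cast, the same script can also be written with the casts spelled out. Values in a map declared as map[] have no declared type, so Pig treats them as bytearray. The following is only a sketch, assuming every age parses as an int and every height as a double (so 5.10 is read as 5.1); the aliases age and avg_height in the last statement are names chosen here for illustration:

```pig
-- Assumption: all age values parse as int and all height values as double.
records = LOAD '~/Documents/Pig_Map.txt' AS (details:map[]);

-- Cast the untyped (bytearray) map values explicitly while projecting them.
records1 = FOREACH records GENERATE
    (int)details#'age' AS age,
    (double)details#'height' AS height;

group_data = GROUP records1 BY age;

-- records1.height is now a bag of doubles, so AVG has a single matching signature.
display_records = FOREACH group_data GENERATE
    group AS age,
    AVG(records1.height) AS avg_height;

DUMP display_records;
```

With the casts in place, DESCRIBE records1 should report age as int and height as double instead of bytearray, which makes the schema easier to check before calling AVG.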