Pig Latin query using GROUP BY and the MAX function

Time: 2013-04-15 04:07:57

Tags: apache-pig

Given the table:

Place(name, province, population, mayorid)

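How would you write the following query in Pig Latin? Return the most populous place in each province. Your result set should contain the province name, the place name, and the population of that place.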

1 answer:

Answer 0 (score: 0)

I haven't tested this, but something like
-- population is assumed to be a long; untyped fields default to bytearray and would not sort numerically.
places = LOAD 'placesInput' AS (name:chararray, province:chararray, population:long, mayorid);
placesProjected = FOREACH places GENERATE name, province, population;
placesGrouped = GROUP placesProjected BY province;
biggestPlaces = FOREACH placesGrouped {
    -- Within each province, sort by population (largest first) and keep the top row.
    sorted = ORDER placesProjected BY population DESC;
    maxPopulation = LIMIT sorted 1;
    GENERATE group AS province, FLATTEN(maxPopulation.(name, population)) AS (name, population);
};

ought to work.
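
Since the question title mentions MAX, here is a rough sketch of a variant that uses the built-in MAX function instead of ORDER and LIMIT. It is also untested and rests on assumptions: the input is read with the default loader, population can be loaded as a long, and the aliases grouped, maxPerProvince, joined and result are just illustrative names. It computes the largest population per province and then joins back to the original relation to recover the place name.

```pig
-- Assumption: population is numeric, so it is declared as long here.
places = LOAD 'placesInput' AS (name:chararray, province:chararray, population:long, mayorid);

-- One row per province holding the largest population found in that province.
grouped = GROUP places BY province;
maxPerProvince = FOREACH grouped GENERATE group AS province, MAX(places.population) AS maxPopulation;

-- Join back on (province, population) to find which place has that population.
joined = JOIN places BY (province, population), maxPerProvince BY (province, maxPopulation);
result = FOREACH joined GENERATE places::province AS province, places::name AS name, places::population AS population;
```

One difference in behaviour worth noting: if two places in the same province tie for the largest population, this join-based version returns both of them, whereas the LIMIT 1 version returns only one.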