B = GROUP A BY state;
C = FOREACH B {
    DA = ORDER A BY population DESC;
    DB = LIMIT DA 5;
    GENERATE FLATTEN(group), FLATTEN(DB.name), FLATTEN(DB.population);
}
The problem is that I get each city name 5 times instead of once. The result I get is:
(ALASKA,M,27257)
(ALASKA,M,23696)
(ALASKA,M,19949)
(ALASKA,M,19926)
(ALASKA,M,19833)
(ALASKA,H,27257)
(ALASKA,H,23696)
(ALASKA,H,19949)
(ALASKA,H,19926)
(ALASKA,H,19833)
The output I need is:
(ALASKA,M,27257)
(ALASKA,H,23696)
Answer 0 (score: 1)
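The two flattens, FLATTEN(DB.name) and FLATTEN(DB.population), produce a Cartesian product between the two bags: each bag is un-nested independently, so every name is paired with every population (5 x 5 = 25 rows per state). Replace them with a single flatten over one bag that holds both fields: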
B = GROUP A BY state;
C = FOREACH B {
    DA = ORDER A BY population DESC;
    DB = LIMIT DA 5;
    GENERATE FLATTEN(group), FLATTEN(DB.(name, population));
}
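For context, here is a minimal end-to-end sketch of the single-flatten approach. The input file name 'cities.csv', the comma delimiter, and the field types are assumptions made for illustration; only the field names state, name, and population come from the question.

```pig
-- Hypothetical input: one city per line, formatted as state,name,population.
A = LOAD 'cities.csv' USING PigStorage(',')
    AS (state:chararray, name:chararray, population:int);

B = GROUP A BY state;

C = FOREACH B {
    -- Sort the cities of this state by population, largest first.
    DA = ORDER A BY population DESC;
    -- Keep at most the five largest.
    DB = LIMIT DA 5;
    -- Project both fields from the same bag, then flatten once,
    -- so each (name, population) pair stays together.
    GENERATE group AS state, FLATTEN(DB.(name, population));
}

DUMP C;
```

Because the grouping key is a single field, group is already a scalar here, so the sketch emits it directly and names it state.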
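Alternatively, since the bag created by GROUP BY contains all of the original tuples with all of their columns (including state), you can do this: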
B = GROUP A BY state;
C = FOREACH B {
    DA = ORDER A BY population DESC;
    DB = LIMIT DA 5;
    GENERATE FLATTEN(DB);
}