How do I filter out city, year, and temperature using Pig?

Time: 2017-11-30 11:32:26

Tags: apache-pig hadoop2

Records:

Pune,2007,31.5
Pune,2007,30.5
Pune,2008,34.5
Blre,2009,13.0
Blre,2009,10.5

Script I am using:

grunt> A = LOAD '/home/cloudera/temp' using PigStorage(',') AS (city:chararray,year:int,temp:double);
grunt> B = group A by city;
grunt> C = FOREACH B GENERATE group, MAX(A.temp);

Expected output:

 Pune, 34.5
 Blre, 13.0

How can I achieve this? Thanks in advance.

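1 answer:

Answer 0: (score: 0)

Group by both city and year, so that the maximum temperature is computed per (city, year) pair rather than per city. FLATTEN(group) turns the grouping tuple back into separate city and year columns.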

A = LOAD '/home/cloudera/temp' using PigStorage(',') AS (city:chararray,year:int,temp:double);
B = group A by (city,year);
C = FOREACH B GENERATE FLATTEN(group) AS (city,year), MAX(A.temp);
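
To check what grouping by (city, year) yields on the sample records, here is a minimal sketch in plain Python that mimics the GROUP ... MAX logic outside Pig. It only illustrates the semantics and is not Pig code; the helper name max_temp_by_city_year is hypothetical, and the records are copied from the question.

```python
from collections import defaultdict

# Sample records from the question: (city, year, temp)
records = [
    ("Pune", 2007, 31.5),
    ("Pune", 2007, 30.5),
    ("Pune", 2008, 34.5),
    ("Blre", 2009, 13.0),
    ("Blre", 2009, 10.5),
]


def max_temp_by_city_year(rows):
    """Mimic grouping rows by (city, year) and taking MAX(temp) per group."""
    groups = defaultdict(list)
    for city, year, temp in rows:
        groups[(city, year)].append(temp)
    return {key: max(temps) for key, temps in groups.items()}


result = max_temp_by_city_year(records)

# Print one line per (city, year) group, sorted for a stable order.
for (city, year), temp in sorted(result.items()):
    print(city, year, temp)
```

Each (city, year) pair gets its own row, so Pune appears once for 2007 and once for 2008. Grouping by city alone would collapse those into a single Pune row.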