1)
q3_ReasonYearWise = FOREACH q3_SelectedColumnForReason GENERATE GetYear(application_dt) AS Application_Year, loan_purpose;
2)
q3_Group_Reason_Year = GROUP q3_ReasonYearWise BY(Application_Year, loan_purpose);
3)
q3_Count_Reasons_Yearwise = FOREACH q3_Group_Reason_Year GENERATE group AS me, COUNT(q3_ReasonYearWise.(Application_Year,loan_purpose)) AS tot;
(2007,car) 5
(2007,house) 1
(2007,other) 53
(2007,moving) 6
(2007,medical) 2
(2008,car) 41
(2008,house) 16
(2008,other) 208
(2008,moving) 20
(2008,medical) 27
(2008,wedding) 44
(2008,vacation) 9
(2009,car) 170
(2009,house) 60
(2009,other) 595
(2009,moving) 58
(2009,medical) 84
(2009,wedding) 132
(2009,vacation) 26
After this, how do I find the max for each year? My output must look like this:
(2007,other) 53
(2008,other) 208
(2009,other) 595
Answer 0 (score: 0)
Can you try it like this?
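-- group is the (Application_Year, loan_purpose) key tuple, so group.Application_Year is a single year value rather than a bag
q3_Count_Reasons_Yearwise = FOREACH q3_Group_Reason_Year GENERATE group.Application_Year AS my_application_year, group AS me, COUNT(q3_ReasonYearWise.(Application_Year,loan_purpose)) AS tot;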
At the end of your step 3, your output should look like this:
2007 (2007,car) 5
2007 (2007,house) 1
2008 (2008,car) 41
2008 (2008,house) 16
After that, do this:
A = GROUP q3_Count_Reasons_Yearwise BY my_application_year;
B = FOREACH A {
    sortByMax = ORDER q3_Count_Reasons_Yearwise BY tot DESC;
    topMax = LIMIT sortByMax 1;
    GENERATE FLATTEN(topMax.$1), FLATTEN(topMax.$2);
}
DUMP B;
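For comparison, here is a minimal sketch of the same group, sort, and take-the-first idea that flattens the group key first, so the year and the loan purpose become ordinary fields. It assumes the relations q3_Group_Reason_Year and q3_ReasonYearWise from the question; the names q3_Counts_Flat, q3_By_Year, q3_Top_Per_Year, by_tot_desc and top_row are made up for illustration and are not from the original post.

```pig
-- Count rows per (year, purpose) and expose both keys as plain fields.
q3_Counts_Flat = FOREACH q3_Group_Reason_Year GENERATE
    FLATTEN(group) AS (Application_Year, loan_purpose),
    COUNT(q3_ReasonYearWise) AS tot;

-- Regroup by year only, so each group holds every purpose for that year.
q3_By_Year = GROUP q3_Counts_Flat BY Application_Year;

-- Inside each year, sort by count descending and keep the first row.
q3_Top_Per_Year = FOREACH q3_By_Year {
    by_tot_desc = ORDER q3_Counts_Flat BY tot DESC;
    top_row = LIMIT by_tot_desc 1;
    GENERATE FLATTEN(top_row);
}
DUMP q3_Top_Per_Year;
```

With the counts listed in the question, this should produce one row per year, (2007,other,53), (2008,other,208) and (2009,other,595), although DUMP does not guarantee the order of the rows. If two purposes tie for the highest count in a year, LIMIT keeps only one of them.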