查询:
records = LOAD 'input' using PigStorage(' ') as (id:int, name:chararray, desination:chararray, date:chararray, salary: long);
示例输入:
(10102,neha,developer,14/02/13,32000)
(10103,deva,admin,02/02/14,40000)
(10102,neha,developer,01/01/14,45000)
(10245,sasi,developer,01/01/14,20000)
(10109,surya,manager,01/02/2014,56000)
(10102,neha,developer,01/02/2014,45000)
(10245,sasi,developer,02/01/2014,25000)
我想根据日期年份(非整个日期)过滤上述数据。
答案 0 :(得分:1)
检查这是否适合您。
(2013,{(neha,developer,2013)})
(2014,{(deva,admin,2014),(neha,developer,2014),(sasi,developer,2014),(surya,manager,2014),(neha,developer,2014),(sasi,developer,2014)})
最终输出将是:
order flow: 2 2 2 2 2 2 2 2 2 2 2 2
time (mn) : 5 10 15 20 25 30 35 40 45 50 55 60
orders done: 1 2 3 5 6 7 8 10 11 12 14 15