I have the following code:
Input_File = load '/user/cloudera/teste' USING PigStorage (' ')
as (ID:Int,
Descrip:Chararray,
Date:Datetime);
groupped = group Input_File by (ID, Date);
ranked = foreach groupped {
ranked = rank groupped by ID desc DENSE;
generate flatten(ranked);
}
STORE ranked into '/user/cloudera/teste1123';
I'm trying to create a rank column on this dataset:
id des date
1 A 01-01-2016
2 A 02-01-2016
2 C 03-01-2016
2 D 03-01-2016
3 A 01-01-2016
The main goal is to get this:
rank id desc date
1 1 A 01-01-2016
2 2 A 02-01-2016
3 2 C 03-01-2016
3 2 D 03-01-2016
4 3 A 01-01-2016
But when I run my code, I get the following error:
ERROR 1200: <line 5, column 14> Syntax error, unexpected symbol at or near 'groupped'
Failed to parse: <line 5, column 14> Syntax error, unexpected symbol at or near 'groupped'
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:241)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:179)
at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1660)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1633)
at org.apache.pig.PigServer.registerQuery(PigServer.java:587)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1090)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:547)
at org.apache.pig.Main.main(Main.java:158)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
What am I doing wrong?
Many thanks!
Answer 0 (score: 0)
I'm not sure where the syntax error is, but here is the solution you are looking for.
Note: you will have to load the date as a chararray.
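-- Load the date column as chararray (see the note above).
Input_File = load '/user/cloudera/teste' USING PigStorage(' ') as (ID:int, Descrip:chararray, Date:chararray);
-- DENSE: rows with the same (ID, Date) share a rank, and no rank numbers are skipped.
ranked = rank Input_File by ID ASC, Date DENSE;
dump ranked;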
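To see why a dense rank over (ID, Date) gives the numbering asked for in the question, here is a minimal Python sketch of the same idea. It is only an illustration, not how Pig implements RANK. The `rows` list and the `dense_rank` helper are hypothetical names made up for this example, and the sketch parses the dates, whereas the Pig script above compares them as plain strings. For dd-MM-yyyy values a string comparison orders by day first, which happens to agree with date order on this sample but will not in general.

```python
from datetime import datetime

# Sample rows copied from the question: (id, descrip, date as dd-MM-yyyy).
rows = [
    (1, "A", "01-01-2016"),
    (2, "A", "02-01-2016"),
    (2, "C", "03-01-2016"),
    (2, "D", "03-01-2016"),
    (3, "A", "01-01-2016"),
]


def dense_rank(rows):
    """Return rows prefixed with a dense rank over (id, date).

    Hypothetical helper: rows with the same (id, date) key share a rank,
    and the rank grows by exactly 1 for each new key (no gaps).
    """
    def key(row):
        # Assumption: dates are dd-MM-yyyy, as in the sample data.
        return (row[0], datetime.strptime(row[2], "%d-%m-%Y"))

    ranked = []
    rank = 0
    previous = None
    for row in sorted(rows, key=key):
        current = key(row)
        if current != previous:
            rank += 1
            previous = current
        ranked.append((rank,) + row)
    return ranked


for ranked_row in dense_rank(rows):
    print(*ranked_row)
```

The two rows with id 2 and date 03-01-2016 share rank 3, and the next row gets rank 4, which is the result the question is after.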
Output