Suppose I have set_of_values:
a, k
a, l
a, m
b, x
b, y
b, z
If I use
a = RANK set_of_values;
I get:
1, a, k
2, a, l
3, a, m
4, b, x
5, b, y
6, b, z
What I want is a RANK, but within each group. First I group:
a = group set_of_values by first_value;
(a,{(a,k),(a,l),(a,m)})
(b,{(b,x),(b,y),(b,z)})
What should I do now to get:
(a,{(1,a,k),(2,a,l),(3,a,m)})
(b,{(1,b,x),(2,b,y),(3,b,z)})
EDIT (added RANK inside the foreach):
b = foreach a { c = RANK $1; generate c; }
I get:
2014-03-05 09:55:05,601 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 5, column 20> Syntax error, unexpected symbol at or near 'RANK'
Details at logfile: /export/home/pig/pig_1394009645035.log
Log file:
ERROR 1200: <line 5, column 20> Syntax error, unexpected symbol at or near 'RANK'
Failed to parse: <line 5, column 20> Syntax error, unexpected symbol at or near 'RANK'
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:235)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:177)
at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1571)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1544)
at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:988)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:538)
at org.apache.pig.Main.main(Main.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
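Answer 0 (score: 1)
This may be too late for a reply, but I found that someone posted exactly this question on Stack Overflow: Usage of Apache Pig rank function
P.S.: He uses the DataFu UDF Enumerate, and it worked for me.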
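For reference, here is a minimal sketch of what that approach might look like. RANK is a relational operator, and as far as I know it is not among the operators Pig accepts inside a nested foreach block, which would explain the syntax error above. The jar path, the input file name, and the field name second_value are assumptions made up for illustration; only set_of_values and first_value come from the question.

```pig
-- Assumption: hypothetical jar location; point this at your own DataFu jar.
REGISTER /path/to/datafu.jar;

-- Passing '1' makes the numbering start at 1 rather than 0.
DEFINE Enumerate datafu.pig.bags.Enumerate('1');

-- Assumption: file name, delimiter, and second_value are illustrative only.
set_of_values = LOAD 'set_of_values.txt' USING PigStorage(',')
                AS (first_value:chararray, second_value:chararray);

grouped = GROUP set_of_values BY first_value;

ranked = FOREACH grouped {
    -- Bags carry no guaranteed order, so sort within the group before numbering.
    sorted = ORDER set_of_values BY second_value;
    GENERATE group, Enumerate(sorted);
};

DUMP ranked;
```

Note that Enumerate appends the index as the last field of each tuple, so the position comes after the original fields rather than in front of them as in the desired output. If the order of the fields matters, a further foreach over the flattened bag can rearrange them.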