Which classes parse Pig and Hive commands into MapReduce jobs, and what is the algorithm behind this parsing?
Answer 0 (score: 4)
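Both Pig and Hive use ANTLR to build the compilers that parse their scripts. If you are not familiar with compiler theory, I recommend reading some material on it first.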
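For Pig, the ANTLR grammar sources are src/org/apache/pig/parser/QueryLexer.g and src/org/apache/pig/parser/QueryParser.g. They are compiled into org.apache.pig.parser.QueryLexer and org.apache.pig.parser.QueryParser. These two classes only turn a Pig script into an abstract syntax tree (AST). The AST is then converted into an org.apache.pig.newplan.logical.relational.LogicalPlan, and the LogicalPlan is in turn converted into an org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan, which MapReduceLauncher.compile then turns into an MROperPlan. Here are some relevant source files and methods: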
org.apache.pig.newplan.logical.relational.LogicalPlan
org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.plans.MROperPlan
org.apache.pig.parser.QueryParserDriver.parse(String)
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(LogicalPlan, Properties)
org.apache.pig.PigServer.launchPlan(PhysicalPlan, String)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(PhysicalPlan, PigContext)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(MROperPlan, MapReduceOper, Configuration, PigContext)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(MROperPlan, String)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(PhysicalPlan, String, PigContext)
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.constructLROutput(List<Result>, List<Result>, Tuple)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.Map.collect(Context, Tuple)
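To make the staging concrete, here is a minimal toy sketch of the same idea: script text, then a tiny AST, then a logical plan, then a physical plan, then a map side and a reduce side. Everything in it is an assumption made for illustration. ToyPigPipeline, its methods, and the operator labels are hypothetical and are not Pig's classes, the hand-written parse method merely stands in for the ANTLR-generated lexer and parser, and the split at the shuffle is a simplification that handles one GROUP only.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: none of these types or methods are Pig's own classes.
public class ToyPigPipeline {

    // Stage 1 (stand-in for the ANTLR-generated lexer/parser):
    // turn "alias = OP args;" into a tiny AST node {alias, OP, args}.
    static String[] parse(String statement) {
        String s = statement.trim();
        if (s.endsWith(";")) {
            s = s.substring(0, s.length() - 1);
        }
        String[] sides = s.split("=", 2);
        if (sides.length != 2) {
            throw new IllegalArgumentException("expected 'alias = OP args': " + statement);
        }
        String[] rhs = sides[1].trim().split("\\s+", 2);
        return new String[] { sides[0].trim(), rhs[0].toUpperCase(), rhs.length > 1 ? rhs[1] : "" };
    }

    // Stage 2 (stand-in for a logical plan): keep the operator of each statement, in order.
    static List<String> toLogicalPlan(List<String> script) {
        List<String> plan = new ArrayList<>();
        for (String statement : script) {
            plan.add(parse(statement)[1]);
        }
        return plan;
    }

    // Stage 3 (stand-in for a physical plan): GROUP expands into three physical steps.
    static List<String> toPhysicalPlan(List<String> logical) {
        List<String> plan = new ArrayList<>();
        for (String op : logical) {
            if (op.equals("GROUP")) {
                plan.add("LOCAL_REARRANGE");
                plan.add("GLOBAL_REARRANGE");
                plan.add("PACKAGE");
            } else {
                plan.add(op);
            }
        }
        return plan;
    }

    // Stage 4 (stand-in for a MapReduce plan): the shuffle marks the map/reduce boundary.
    // Returns [mapSide, reduceSide]; assumes at most one shuffle.
    static List<List<String>> toMapReduce(List<String> physical) {
        List<String> map = new ArrayList<>();
        List<String> reduce = new ArrayList<>();
        boolean shuffled = false;
        for (String op : physical) {
            if (op.equals("GLOBAL_REARRANGE")) {
                shuffled = true; // done by the framework's shuffle, so it is on neither side
            } else if (shuffled) {
                reduce.add(op);
            } else {
                map.add(op);
            }
        }
        return List.of(map, reduce);
    }

    public static void main(String[] args) {
        List<String> script = List.of(
            "a = LOAD 'input';",
            "b = FILTER a BY $0 > 1;",
            "c = GROUP b BY $0;",
            "d = FOREACH c GENERATE group, COUNT(b);");
        List<List<String>> job = toMapReduce(toPhysicalPlan(toLogicalPlan(script)));
        System.out.println("map:    " + job.get(0));
        System.out.println("reduce: " + job.get(1));
    }
}
```

The point of the sketch is only that each stage consumes the previous stage's output, which is why it helps to read the real classes in the order listed above.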
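For Hive, the ANTLR grammar source is ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g. It is compiled into org.apache.hadoop.hive.ql.parse.HiveLexer and org.apache.hadoop.hive.ql.parse.HiveParser. These two classes turn a Hive script into an abstract syntax tree, which is then converted into an org.apache.hadoop.hive.ql.QueryPlan. The mapper and reducer that Hive runs are ExecMapper and ExecReducer.
Here are some relevant source files and methods: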
org.apache.hadoop.hive.cli.CliDriver
org.apache.hadoop.hive.ql.Driver
org.apache.hadoop.hive.ql.Driver.run(String)
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(String, Context)
org.apache.hadoop.hive.ql.parse.ASTNode
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer
org.apache.hadoop.hive.ql.QueryPlan
org.apache.hadoop.hive.ql.Driver.compile(String, boolean)
org.apache.hadoop.hive.ql.exec.TaskRunner
org.apache.hadoop.hive.ql.Driver.execute()
org.apache.hadoop.hive.ql.exec.ExecDriver
org.apache.hadoop.hive.ql.exec.ExecMapper
org.apache.hadoop.hive.ql.exec.ExecReducer
org.apache.hadoop.hive.ql.exec.MapOperator
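As a rough mental model for the last few classes, it may help to picture a generic mapper that does no query-specific work itself and simply pushes each input row into the root of an operator tree built from the plan. The sketch below is a toy under that assumption. ToyOperatorTree, Op, Filter, Select, Collect, and runMap are hypothetical names, not Hive's API, and the hard-coded tree roughly corresponds to a query of the form SELECT col0 FROM t WHERE col1 > 10.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy illustration only: these are not Hive's operator classes.
public class ToyOperatorTree {

    // An operator handles one row and forwards the result to its child, if any.
    static abstract class Op {
        Op child;

        // Attach the next operator and return it so calls can be chained.
        Op then(Op next) {
            this.child = next;
            return next;
        }

        abstract void process(String[] row);

        void forward(String[] row) {
            if (child != null) {
                child.process(row);
            }
        }
    }

    // Passes a row on only if the predicate holds (a WHERE clause).
    static class Filter extends Op {
        final Predicate<String[]> keep;
        Filter(Predicate<String[]> keep) { this.keep = keep; }
        void process(String[] row) {
            if (keep.test(row)) {
                forward(row);
            }
        }
    }

    // Keeps a single column of the row (a SELECT list with one column).
    static class Select extends Op {
        final int column;
        Select(int column) { this.column = column; }
        void process(String[] row) {
            forward(new String[] { row[column] });
        }
    }

    // Terminal operator: collects whatever reaches the end of the tree.
    static class Collect extends Op {
        final List<String> out = new ArrayList<>();
        void process(String[] row) {
            out.add(String.join(",", row));
        }
    }

    // Stand-in for the mapper: split each input line and push it into the tree root.
    static List<String> runMap(List<String> lines) {
        Op root = new Filter(r -> Integer.parseInt(r[1]) > 10);
        Collect sink = new Collect();
        root.then(new Select(0)).then(sink);
        for (String line : lines) {
            root.process(line.split(","));
        }
        return sink.out;
    }

    public static void main(String[] args) {
        System.out.println(runMap(List.of("a,5", "b,20", "c,11")));
    }
}
```

In this toy the tree is hard-coded inside runMap; in a real engine it would come out of the compiled plan, so treat the sketch as orientation for reading the source rather than a description of it.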
Finally, I suggest downloading the source code of both projects and browsing it in Eclipse to find answers to anything else you want to know.