Question

Environment: HDP 2.3 Sandbox

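Problem: I created a table with only two columns in Hive. Now I want to read it from my MapReduce (MR) code through the HCatalog integration. The MR job cannot read the table from the MySQL metastore. For some reason it uses Derby instead, so it fails with a "table not found" message.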
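One way to narrow down the Derby fallback is to confirm which metastore database a given hive-site.xml actually points to. The sketch below is a small standalone helper, not part of the original job; the class name `HiveSiteProperty` and its method are hypothetical. It reads a single property from a Hadoop-style configuration file, by default `javax.jdo.option.ConnectionURL`, the Hive property that holds the metastore JDBC URL.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical diagnostic helper; uses only the JDK, no Hadoop or Hive classes.
public class HiveSiteProperty {

    /** Returns the value of the named property, or null if the file does not define it. */
    public static String readProperty(String xmlText, String key) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlText)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && key.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to hive-site.xml; args[1] (optional): property name to look up
        String key = args.length > 1 ? args[1] : "javax.jdo.option.ConnectionURL";
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println(key + " = " + readProperty(xml, key));
    }
}
```

Run it with the path to the hive-site.xml in question as the first argument. A URL starting with `jdbc:mysql` suggests the file itself is fine and the job is simply not loading it. A `jdbc:derby` URL or `null` suggests this is not the file the working Hive client uses, since Hive defaults to an embedded Derby database when nothing else is configured. If the client talks to a remote metastore service, the property to check would be `hive.metastore.uris` instead.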

Job client code:

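// Imports assume the org.apache.hive.hcatalog packages (Hive 0.13 and later).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.schema.HCatSchema;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;
import org.apache.hive.hcatalog.mapreduce.HCatOutputFormat;
import org.apache.hive.hcatalog.mapreduce.OutputJobInfo;

public class HCatalogMRJob extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        args = new GenericOptionsParser(conf, args).getRemainingArgs();
        String inputTableName = args[0];
        String outputTableName = args[1];
        String dbName = null; // null selects the default database
        Job job = Job.getInstance(conf, "HCatalogMRJob");

        // Input side: read records from the Hive table through HCatalog.
        HCatInputFormat.setInput(job, dbName, inputTableName);
        job.setInputFormatClass(HCatInputFormat.class);
        job.setJarByClass(HCatalogMRJob.class);
        // HCatalogMapper and HCatalogReducer are not shown in the question.
        job.setMapperClass(HCatalogMapper.class);
        job.setReducerClass(HCatalogReducer.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);

        // Output side: write records to the target table using its own schema.
        HCatOutputFormat.setOutput(job, OutputJobInfo.create(dbName, outputTableName, null));
        // Job works on a copy of conf, so read the schema back from the job's configuration.
        HCatSchema s = HCatOutputFormat.getTableSchema(job.getConfiguration());
        System.err.println("INFO: output schema explicitly set for writing:" + s);
        HCatOutputFormat.setSchema(job, s);
        job.setOutputFormatClass(HCatOutputFormat.class);
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new HCatalogMRJob(), args);
        System.exit(exitCode);
    }
}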

Job run command:

hadoop jar mr-hcat.jar input_table out_table

Before running this command, I added the required HCatalog and Hive jars to the classpath through the HADOOP_CLASSPATH variable.

Question:

Now, how do I use hive-site.xml correctly?

I tried adding it to the classpath through the same HADOOP_CLASSPATH variable mentioned above, but the job still fails.
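A quick check that may help here: Hive's configuration class looks up hive-site.xml as a classpath resource, so the file is only picked up if the directory containing it is on the classpath. A classpath entry that names the XML file itself is not searched. The sketch below is a hypothetical standalone diagnostic (the class name `HiveSiteClasspathCheck` is not from the original job) that reports whether the resource is visible to the JVM.

```java
import java.net.URL;

// Hypothetical diagnostic; uses only the JDK.
public class HiveSiteClasspathCheck {

    /** Returns the location of the named classpath resource, or null if it is not visible. */
    public static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the loader that loaded this class.
            loader = HiveSiteClasspathCheck.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("hive-site.xml");
        if (url == null) {
            System.out.println("hive-site.xml is NOT visible on the classpath");
        } else {
            System.out.println("hive-site.xml found at " + url);
        }
    }
}
```

Running this class with the same HADOOP_CLASSPATH as the job shows whether the client JVM can see the file at all. If it reports that the file is not visible, adding the configuration directory rather than the file to HADOOP_CLASSPATH would be the first thing to try. On HDP that directory is commonly /etc/hive/conf, though that path is an assumption and should be verified on the sandbox.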

MapReduce and HCatalog integration fails to use the MySQL metastore

0 answers: