Problem writing Hive dynamic partitions using HCatalog

Time: 2015-09-15 08:25:48

Tags: hadoop hive hcatalog

I am running into a problem when inserting data into a Hive table using dynamic partitioning.

I have an existing table with one regular column and one partition column, and I am trying to insert data into both. My code:

// Required imports (the snippet assumes DATABASE_NAME, TABLE_NAME and
// CUSTOM_CONFIG, a Map<String, String> of job properties, are defined elsewhere)
import java.util.ArrayList;
import java.util.List;

import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.HCatRecord;
import org.apache.hive.hcatalog.data.transfer.DataTransferFactory;
import org.apache.hive.hcatalog.data.transfer.HCatWriter;
import org.apache.hive.hcatalog.data.transfer.WriteEntity;
import org.apache.hive.hcatalog.data.transfer.WriterContext;

// Preparing the writer; withPartition(null) requests dynamic partitioning
WriteEntity.Builder builder = new WriteEntity.Builder();
WriteEntity entity = builder.withDatabase(DATABASE_NAME).withTable(TABLE_NAME).withPartition(null).build();
HCatWriter masterHCatWriter = DataTransferFactory.getHCatWriter(entity, CUSTOM_CONFIG);
WriterContext writerContext = masterHCatWriter.prepareWrite();
HCatWriter hCatWriter = DataTransferFactory.getHCatWriter(writerContext);

// Preparing the record to be written: field 0 is the regular column,
// field 1 is intended to carry the partition column's value
List<HCatRecord> hCatRecordsBatch = new ArrayList<HCatRecord>();
HCatRecord hCatRecord = new DefaultHCatRecord(2);
hCatRecord.set(0, "aaa");
hCatRecord.set(1, "bbb");
hCatRecordsBatch.add(hCatRecord);

// Writing the record
hCatWriter.write(hCatRecordsBatch.iterator());
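(As an aside: the full Data Transfer API lifecycle would normally end with the master committing the write. A minimal sketch of that final step, reusing masterHCatWriter and writerContext from above; in my case execution never gets this far, because write() already fails:)

import org.apache.hive.hcatalog.common.HCatException;

// On the master node, after all slave-side writers have finished:
try {
    masterHCatWriter.commit(writerContext);   // makes the written data visible
} catch (HCatException e) {
    masterHCatWriter.abort(writerContext);    // cleans up on failure
    throw e;
}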

But I get the following exception:

org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : Failed while writing. Cause : org.apache.hive.hcatalog.common.HCatException : 2010 : Invalid partition values specified : Unable to configure dynamic partitioning for storage handler, mismatch between number of partition values obtained[0] and number of partition values required[1]
at org.apache.hive.hcatalog.data.transfer.impl.HCatOutputFormatWriter.write(HCatOutputFormatWriter.java:112)
at ...private classes...
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hive.hcatalog.common.HCatException : 2010 : Invalid partition values specified : Unable to configure dynamic partitioning for storage handler, mismatch between number of partition values obtained[0] and number of partition values required[1]
at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.configureOutputStorageHandler(HCatBaseOutputFormat.java:156)
at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.configureDynamicStorageHandler(FileRecordWriterContainer.java:264)
at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:183)
at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
at org.apache.hive.hcatalog.data.transfer.impl.HCatOutputFormatWriter.write(HCatOutputFormatWriter.java:98)
... 8 more

I browsed through the code of the Hive libraries, and it looks like the prepareWrite() method invoked on the master node obtains the wrong schema. It loads the schema with only the regular column (the partition column is missing), so afterwards it cannot retrieve the partition column's value from the inserted records (which is what the exception means by ...number of partition values obtained[0]...). The same problem is described in this SO question, but in my case I cannot append the column to the schema, because the schema handling is packed away inside the prepareWrite() method.
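For reference, in the plain MapReduce HCatOutputFormat API that the linked question uses, the workaround is to append the partition column to the schema by hand before setting it on the job. A rough sketch of that approach, assuming an already configured Job instance named job and hypothetical names "my_db", "my_table" and "part_col"; this manual append is exactly the step that the Data Transfer API's prepareWrite() hides:

import org.apache.hadoop.mapreduce.Job;
import org.apache.hive.hcatalog.data.schema.HCatFieldSchema;
import org.apache.hive.hcatalog.data.schema.HCatSchema;
import org.apache.hive.hcatalog.mapreduce.HCatOutputFormat;
import org.apache.hive.hcatalog.mapreduce.OutputJobInfo;

// Passing null as the partition spec enables dynamic partitioning
HCatOutputFormat.setOutput(job, OutputJobInfo.create("my_db", "my_table", null));

// getTableSchema() returns only the non-partition columns...
HCatSchema schema = HCatOutputFormat.getTableSchema(job.getConfiguration());
// ...so the partition column has to be appended manually
schema.append(new HCatFieldSchema("part_col", HCatFieldSchema.Type.STRING, null));
HCatOutputFormat.setSchema(job, schema);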

I am using the libraries from the Cloudera distribution, version 5.3.2 (which means Hive version 0.13.1).

I would appreciate any help. Thanks.

0 Answers:

There are no answers yet.