I created a bucketed ORC table in Hive with the following DDL:
create table Employee( EmpID STRING , EmpName STRING)
clustered by (EmpID) into 10 buckets
stored as orc
TBLPROPERTIES('transactional'='true');
Then I ran a Sqoop import:
sqoop import --verbose \
--connect 'RDBMS_JDBC_URL' \
--driver JDBC_DRIVER \
--table Employee \
--null-string '\\N' \
--null-non-string '\\N' \
--username USER \
--password PASSWORD \
--hcatalog-database hive_test_trans \
--hcatalog-table Employee \
--hcatalog-storage-stanza \
"stored as orc" -m 1
It failed with the following exception:
22/12/17 03:28:59 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hive.hcatalog.common.HCatException : 2016 : Error operation not supported : Store into a partition with bucket definition from Pig/Mapreduce is not supported
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:109)
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:339)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:753)
at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:240)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:665)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:601)
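We could work around this by creating a temporary table, but I don't want to add another step.
Can I import data from Oracle directly into the bucketed ORC table without using a temporary table?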
Answer 0 (score: 0)
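Importing data directly into a transactional Hive table this way is still not supported, so you need a workaround.
There is an open JIRA ticket for a fix. Until it is resolved, you have to go through an intermediate step to write the data into Hive. The temporary table option you mention in your question is a good choice.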
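As a rough sketch of the temporary-table route, something like the following HiveQL might work. The staging table name Employee_stg is hypothetical. The sketch assumes the Sqoop command from the question is pointed at it with --hcatalog-table Employee_stg, and that the Hive session is already configured for ACID writes (for example hive.support.concurrency=true and hive.txn.manager set to DbTxnManager). Treat it as an outline under those assumptions rather than a tested procedure.

```sql
-- Hypothetical staging table: same columns as Employee, but neither
-- bucketed nor transactional, so the HCatalog writer used by Sqoop accepts it.
CREATE TABLE hive_test_trans.Employee_stg (
  EmpID   STRING,
  EmpName STRING
)
STORED AS ORC;

-- Run the Sqoop import against Employee_stg first, then let Hive itself
-- do the bucketing and the transactional write into the real table.
-- (On older Hive 1.x releases, hive.enforce.bucketing=true may also be needed.)
INSERT INTO TABLE hive_test_trans.Employee
SELECT EmpID, EmpName
FROM hive_test_trans.Employee_stg;

-- Optional cleanup so the next import starts from an empty staging table.
TRUNCATE TABLE hive_test_trans.Employee_stg;
```

The extra step costs one additional Hive job per load, but it keeps the bucketing and ACID logic inside Hive, where it is supported.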