While dumping data from Hive to MongoDB I ran into the following problem. The commands I ran were:
1)
create external table mongo_users (memberid string, email string, sentdate string, actiontype string, actiondate string, campaignid string, campaignname string)
stored by "org.yong3.hive.mongo.MongoStorageHandler"
with serdeproperties ("mongo.column.mapping" = "memberid,email,sentdate,actiontype,actiondate,campaignid,campaignname")
tblproperties ("mongo.host" = "serverip", "mongo.port" = "port", "mongo.db" = "admin", "mongo.collection" = "dummy");
2)
insert into table mongo_users select * from testmail;
Table descriptions:
mongo_users table:
memberid string from deserializer
email string from deserializer
sentdate string from deserializer
actiontype string from deserializer
actiondate string from deserializer
campaignid string from deserializer
campaignname string from deserializer
testmail table:
memberid string
email string
sentdate string
actiontype string
actiondate string
campaignid string
campaignname string
Error thrown by Hive:
Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"memberid":"1","email":"George1@gmail.com","sentdate":"1st June 2012","actiontype":"Bounced","actiondate":"4-Jun","campaignid":"51674","campaignname":"Brand Awareness"}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"memberid":"1","email":"George1@gmail.com","sentdate":
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
Answer (score: 0):
First, you must set the Hive auxiliary path so the MongoDB storage-handler jars are on the classpath. The command is as follows:
hive --auxpath /home/hadoop/mongo-java-driver-2.12.4.jar,/home/hadoop/hive-mongo-0.0.1-SNAPSHOT.jar
Besides the jars above, you also need to add any other jars required to execute your Hive script.
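As an alternative to passing `--auxpath` on every invocation, the same jars can be registered from inside a Hive session with `ADD JAR` (a sketch; the paths are the same example paths used above and may differ on your cluster):

```
-- Register the MongoDB driver and the Hive-Mongo storage handler
-- for the current Hive session only.
ADD JAR /home/hadoop/mongo-java-driver-2.12.4.jar;
ADD JAR /home/hadoop/hive-mongo-0.0.1-SNAPSHOT.jar;
```

Jars added this way are visible only for the current session, whereas `--auxpath` makes them available to every query launched by that CLI invocation.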
Create the Mongo-backed table:
create external table mongo_users (student_id INT, email_id STRING, delivery_status STRING)
stored by "org.yong3.hive.mongo.MongoStorageHandler"
with serdeproperties ("mongo.column.mapping" = "student_id,email_id,delivery_status")
tblproperties ("mongo.host" = "ServerName", "mongo.port" = "port", "mongo.db" = "admin", "mongo.user" = "admin", "mongo.passwd" = "admin", "mongo.collection" = "test");
Insert data from Hive into Mongo:
insert overwrite table mongo_users select student_id, email_id, delivery_status from avro_table;
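Once the insert completes, you can check from the mongo shell that the rows actually landed in the target collection (a sketch; "admin" and "test" are the database and collection names from the table definition above):

```
// Connect with the mongo shell, switch to the target database,
// and inspect the collection written by the Hive insert.
use admin
db.test.count()
db.test.findOne()
```

If the count is still zero after a successful-looking Hive job, re-check that the classpath jars were loaded and that the column count in the Hive table matches the mongo.column.mapping list exactly.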