Error while dumping data from Hive to MongoDB

Time: 2014-10-30 11:47:37

Tags: mongodb hadoop hive hue

I am facing the following problem while dumping data from Hive to MongoDB. The commands I ran are:

1)
create external table mongo_users(memberid string, email string, sentdate string, actiontype string, actiondate string, campaignid string, campaignname string)
stored by "org.yong3.hive.mongo.MongoStorageHandler"
with serdeproperties ("mongo.column.mapping"="memberid,email,sentdate,actiontype,actiondate,campaignid,campaignname")
tblproperties ("mongo.host"="serverip", "mongo.port"="port", "mongo.db"="admin", "mongo.collection"="dummy");

2)
insert into table mongo_users select * from testmail;

Table descriptions:

Mongo_Users

memberid        string  from deserializer
email           string  from deserializer
sentdate        string  from deserializer
actiontype      string  from deserializer
actiondate      string  from deserializer
campaignid      string  from deserializer
campaignname    string  from deserializer

TestMail table:

memberid        string
email           string
sentdate        string
actiontype      string
actiondate      string
campaignid      string
campaignname    string

Error thrown by Hive:

Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"memberid":"1","email":"George1@gmail.com","sentdate":"1st June 2012","actiontype":"Bounced","actiondate":"4-Jun","campaignid":"51674","campaignname":"Brand Awareness"}
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"memberid":"1","email":"George1@gmail.com","sentdate":

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

1 answer:

Answer 0 (score: 0)

First, you have to set the Hive auxiliary path. The command is as follows:

hive --auxpath /home/hadoop/mongo-java-driver-2.12.4.jar,/home/hadoop/hive-mongo-0.0.1-SNAPSHOT.jar

In addition to the jars above, you also need to add any other jars required to run your Hive script.
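Since the --auxpath value is a comma-separated list of jars, it can help to build it with a small helper that fails early when a jar is missing. The following is only a minimal sketch: the function name build_auxpath is hypothetical, and the jar paths in the usage comment are simply the ones from the command above.

```shell
#!/bin/sh
# build_auxpath is a hypothetical helper, not part of Hive.
# It joins the given jar paths with commas and returns non-zero
# if any of them does not exist as a regular file.
build_auxpath() {
    auxpath=""
    for jar in "$@"; do
        if [ ! -f "$jar" ]; then
            echo "missing jar: $jar" >&2
            return 1
        fi
        # Prepend a comma only when auxpath already holds an entry.
        auxpath="${auxpath:+$auxpath,}$jar"
    done
    printf '%s\n' "$auxpath"
}

# Example usage (paths assumed from the command above):
# hive --auxpath "$(build_auxpath \
#     /home/hadoop/mongo-java-driver-2.12.4.jar \
#     /home/hadoop/hive-mongo-0.0.1-SNAPSHOT.jar)"
```

Any extra jars your script depends on can be passed as further arguments, so they end up in the same --auxpath list.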

Create the Mongo table:

create external table mongo_users(student_id INT, email_id STRING, delivery_status STRING)
stored by "org.yong3.hive.mongo.MongoStorageHandler"
with serdeproperties ("mongo.column.mapping"="student_id,email_id,delivery_status")
tblproperties ("mongo.host"="ServerName", "mongo.port"="port", "mongo.db"="admin", "mongo.user"="admin", "mongo.passwd"="admin", "mongo.collection"="test");

Insert data from Hive into Mongo:

insert overwrite table mongo_users select student_id, email_id, delivery_status from avro_table;