Sqoop export from HDFS to MySQL fails

Date: 2017-12-14 06:57:47

Tags: mysql hadoop sqoop

I don't know where I'm going wrong, but every Sqoop export from HDFS to MySQL fails.

sqoop export --connect "jdbc:mysql://quickstart.cloudera:3306/streaming" \
  --username root --password cloudera --table pd_count --update-key id \
  --update-mode allowinsert --export-dir /user/cloudera/input/* -m 1 --batch

The export directory contains only one folder, which holds 3 files, namely

  1. part-m-00000
  2. part-m-00001
  3. part-m-00002

I updated the last file in order to understand the --update arguments. However, no matter which permutations I try, the job fails.

  1. I exported the data to MySQL without the update options, and the export succeeded.
  2. I imported data into HDFS using --incremental append, and that succeeded (hedged sketches of both working commands follow this list).
  3. But when I try to export data to MySQL with --update-key and --update-mode, nothing is transferred and the job fails.
  4. The command shown above is the last one I used.
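
For reference, hedged sketches of what the two working commands might have looked like; the target directory, check column, and last value here are assumptions for illustration, not taken from the question:

# Plain export without update options (step 1) -- this succeeded
sqoop export --connect "jdbc:mysql://quickstart.cloudera:3306/streaming" \
  --username root --password cloudera --table pd_count \
  --export-dir /user/cloudera/input -m 1

# Incremental-append import into HDFS (step 2) -- check column and
# last value are illustrative assumptions
sqoop import --connect "jdbc:mysql://quickstart.cloudera:3306/streaming" \
  --username root --password cloudera --table pd_count \
  --target-dir /user/cloudera/input \
  --incremental append --check-column id --last-value 0 -m 1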

      Here is the latest error log at this link. Please help me.

      Thanks in advance.

1 Answer:

Answer 0: (score: 0)

OK, I was assuming something different. Could you try the options below:

  1. Use --verbose in the export once again for extended logs.
  2. You can look at the application logs from the failed application. To fetch them, run the following command as the user who ran the Sqoop command: yarn logs -applicationId application_1513399439223_0001 > app_logs.txt
  3. It seems you didn't add --input-fields-terminated-by (see the sketch after this list).
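
A minimal sketch of the same export with those two suggestions applied. The comma delimiter is an assumption for illustration; match --input-fields-terminated-by to the separator actually used in your part-m files.

# Original export plus --verbose and an explicit field delimiter (assumed ',')
sqoop export --verbose \
  --connect "jdbc:mysql://quickstart.cloudera:3306/streaming" \
  --username root --password cloudera --table pd_count \
  --update-key id --update-mode allowinsert \
  --export-dir /user/cloudera/input/* \
  --input-fields-terminated-by ',' \
  -m 1 --batch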

Updating the Answer as per your latest comment

I see you have killed the job. It might be related to performance. Please try tuning the settings below and run the Sqoop export again:

  • Set the number of mappers to 4: -m 4
  • Insert the data in batches: --batch
  • Use the property sqoop.export.records.per.statement to specify the number of records used in each INSERT statement: sqoop export -Dsqoop.export.records.per.statement=100 --connect
  • Finally, specify how many rows will be inserted per transaction with the sqoop.export.statements.per.transaction property: sqoop export -Dsqoop.export.statements.per.transaction=100 --connect (a combined sketch follows this list)
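
Putting those together, a hedged sketch of the tuned command. The 100/100 values are just the illustrative numbers from above, not values tuned for your data, and the delimiter remains an assumption. Note that the generic -D properties must come directly after "sqoop export", before the tool-specific arguments:

# Tuned export: 4 mappers, batched inserts, 100 records per statement,
# 100 statements per transaction (all values illustrative)
sqoop export \
  -Dsqoop.export.records.per.statement=100 \
  -Dsqoop.export.statements.per.transaction=100 \
  --connect "jdbc:mysql://quickstart.cloudera:3306/streaming" \
  --username root --password cloudera --table pd_count \
  --update-key id --update-mode allowinsert \
  --export-dir /user/cloudera/input/* \
  --input-fields-terminated-by ',' \
  -m 4 --batch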

Please provide the YARN logs. Also, what is the volume of data?