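I wrote a script that performs an incremental import of data from an Oracle table into an HDFS directory. I use the following sqoop command for the import: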
sqoop import \
--connect $JDBCconnectionString \
--username $dbUserName \
--password-file $passwordLocal \
--query 'select * from dmt_sim.dim_product WHERE $CONDITIONS' \
--split-by "PRODUCT_TITLE" \
--incremental append \
--check-column "KEY" \
--last-value "1" \
--append \
--fields-terminated-by '\t' \
--target-dir /user/ksrinivasan/dmn_product
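The values of the variables $JDBCconnectionString, $dbUserName, and $passwordLocal are substituted at run time. Connecting to the Oracle database and fetching the boundary values both succeed, but the job throws an error once it starts.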
16/01/25 06:19:29 INFO mapreduce.Job: Job job_1452256584707_106782 running in uber mode : false
06:19:36 16/01/25 06:19:29 INFO mapreduce.Job: map 0% reduce 0%
06:20:07 16/01/25 06:20:00 INFO mapreduce.Job: Task Id : attempt_1452256584707_106782_m_000000_0, Status : FAILED
06:20:07 Error: java.io.IOException: SQLException in nextKeyValue
06:20:07 at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
06:20:07 at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
06:20:07 at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
06:20:07 at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
06:20:07 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
06:20:07 at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
06:20:07 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
06:20:07 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
06:20:07 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
06:20:07 at java.security.AccessController.doPrivileged(Native Method)
06:20:07 at javax.security.auth.Subject.doAs(Subject.java:415)
06:20:07 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
06:20:07 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
06:20:07 Caused by: java.sql.SQLSyntaxErrorException: ORA-00907: missing right parenthesis
06:20:07
06:20:07 at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
06:20:07 at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
06:20:07 at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879)
06:20:07 at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450)
06:20:07 at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
06:20:07 at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
06:20:07 at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207)
06:20:07 at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:884)
06:20:07 at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1167)
06:20:07 at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1289)
06:20:07 at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3584)
06:20:07 at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3628)
06:20:07 at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1493)
06:20:07 at org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
06:20:07 at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
06:20:07 ... 12 more
06:20:07
If anyone else has run into the same problem and has some insight into how to debug this kind of issue, that would be great.
Answer 0 (score: 0)
"SQLException in nextKeyValue" means that the last value "1" is not being carried over to the KEY column of your source at execution time.
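One way to check this is to look at the range of the check column directly with sqoop eval before importing. The following is only a sketch: it reuses the connection variables from the question, while the function name build_eval_args, the array EVAL_ARGS, and the RUN_SQOOP switch are hypothetical names introduced here. By default it only prints the command.

```shell
#!/usr/bin/env bash
# Sketch: inspect the KEY column for rows above the --last-value used in the question.
# build_eval_args, EVAL_ARGS and RUN_SQOOP are hypothetical names, not part of Sqoop.

build_eval_args() {
  # Fills the global array EVAL_ARGS; nothing is executed here.
  EVAL_ARGS=(
    eval
    --connect "$JDBCconnectionString"
    --username "$dbUserName"
    --password-file "$passwordLocal"
    --query 'SELECT MIN(KEY), MAX(KEY), COUNT(*) FROM dmt_sim.dim_product WHERE KEY > 1'
  )
}

build_eval_args

# Print the command for review (shell-quoted so it can be copied).
printf 'sqoop'; printf ' %q' "${EVAL_ARGS[@]}"; printf '\n'

# Run it only when explicitly requested, e.g. RUN_SQOOP=1 ./check_key.sh
if [ "${RUN_SQOOP:-0}" = "1" ]; then
  sqoop "${EVAL_ARGS[@]}"
fi
```

If the count comes back as zero, no rows lie above the last value and the incremental import has nothing to fetch.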
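Answer 1 (score: 0)
Check every column for possible null values, and in particular check whether any date column in the select you run against dmt_sim.dim_product can contain a date of '0000-00-00'.
Identify those columns and use a condition in the select to replace the null or the date with something else (a value you can recognize).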
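As a rough illustration of replacing nulls in the select, Oracle's NVL function can substitute a recognizable placeholder. This sketch only uses the two column names that appear in the question (KEY and PRODUCT_TITLE); the placeholder 'UNKNOWN' and the variable name QUERY are assumptions, and the remaining columns of the table would have to be listed in the same way.

```shell
#!/usr/bin/env bash
# Sketch: free-form query that replaces NULL titles with a recognizable marker.
# 'UNKNOWN' is an arbitrary placeholder chosen for this example.
# \$CONDITIONS is escaped so the shell leaves the token intact for Sqoop.
QUERY="select KEY, NVL(PRODUCT_TITLE, 'UNKNOWN') AS PRODUCT_TITLE from dmt_sim.dim_product WHERE \$CONDITIONS"

# Show the query exactly as Sqoop would receive it.
printf '%s\n' "$QUERY"
```

The variable can then be passed to the import as --query "$QUERY" in place of the select * query.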
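Answer 2 (score: 0)
This happens because the column used for splitting ("PRODUCT_TITLE") is not numeric.
When Sqoop translates the split into intervals such as "PRODUCT_TITLE" >= xx AND "PRODUCT_TITLE" <= yy, which are typical of numeric fields, the query fails.
Add --verbose 2> file_log.log when running the command (to check whether the generated split statements are correct).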
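A minimal sketch of this suggestion follows. It keeps the command from the question but splits on KEY instead of PRODUCT_TITLE and adds --verbose. That KEY is numeric is an assumption, since the question does not state the column type. The names build_import_args, SQOOP_ARGS, and RUN_SQOOP are hypothetical and only serve the example; by default the script prints the command without running it.

```shell
#!/usr/bin/env bash
# Sketch: same import as in the question, but split on KEY (assumed numeric)
# and with --verbose so the generated split queries end up in the log.
# build_import_args, SQOOP_ARGS and RUN_SQOOP are hypothetical names.

build_import_args() {
  # Fills the global array SQOOP_ARGS; nothing is executed here.
  SQOOP_ARGS=(
    import
    --verbose
    --connect "$JDBCconnectionString"
    --username "$dbUserName"
    --password-file "$passwordLocal"
    --query 'select * from dmt_sim.dim_product WHERE $CONDITIONS'
    --split-by KEY
    --incremental append
    --check-column KEY
    --last-value 1
    --append
    --fields-terminated-by '\t'
    --target-dir /user/ksrinivasan/dmn_product
  )
}

build_import_args

# Print the command for review (shell-quoted so it can be copied).
printf 'sqoop'; printf ' %q' "${SQOOP_ARGS[@]}"; printf '\n'

# Run it only when explicitly requested; stderr goes to file_log.log
# as suggested in the answer.
if [ "${RUN_SQOOP:-0}" = "1" ]; then
  sqoop "${SQOOP_ARGS[@]}" 2> file_log.log
fi
```

After a real run, file_log.log can be searched for the bounding-values query and the split bounds that Sqoop typically logs, to see which WHERE clauses were sent to Oracle.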