Sqoop fails with java.io.IOException

Time: 2017-03-30 15:36:40

Tags: hadoop teradata sqoop

This is the sqoop import I use to pull data from Teradata:

     sqoop import -libjars jars --driver drivers --connect connection_url -m 1 --hive-overwrite --hive-import --hive-database hivedatabase --hive-table hivetable --target-dir '/user/hive/warehouse/database.db/table_name' --as-parquetfile --query "select c1,c2,c3, to_char(SOURCE_ACTIVATION_DT,'YYYY-MM-DD HH24:MI:SS') as SOURCE_ACTIVATION_DT,to_char(SOURCE_DEACTIVATION_DT,'YYYY-MM-DD HH24:MI:SS') as SOURCE_DEACTIVATION_DT,to_char(EFF_DT,'YYYY-MM-DD HH24:MI:SS') as EFF_DT,to_char(EXP_DT,'YYYY-MM-DD HH24:MI:SS') as EXP_DT,to_char(SYS_UPDATE_DTM,'YYYY-MM-DD HH24:MI:SS') as SYS_UPDATE_DTM,to_char(SYS_LOAD_DTM,'YYYY-MM-DD HH24:MI:SS') as SYS_LOAD_DTM from source_schema.table_name WHERE to_char(SYS_UPDATE_DTM,'YYYY-MM-DD HH24:MI:SS')> '2017-03-30 10:00:00' OR to_char(SYS_LOAD_DTM,'YYYY-MM-DD HH24:MI:SS') > '2017-03-30 10:00:00' AND \$CONDITIONS"

Below is the error I get. The job ran fine for the past two days and only recently started returning this error.

17/03/29 20:07:53 INFO mapreduce.Job:  map 0% reduce 0%
17/03/29 20:56:46 INFO mapreduce.Job: Task Id : attempt_1487033963691_263120_m_000000_0, Status : FAILED
Error: java.io.IOException: SQLException in nextKeyValue
    at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.sql.SQLException: [Teradata JDBC Driver] [TeraJDBC 15.10.00.14] [Error 1005] [SQLState HY000] Unexpected parcel kind received: 9
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDriverJDBCException(ErrorFactory.java:94)
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDriverJDBCException(ErrorFactory.java:69)
    at com.teradata.jdbc.jdbc_4.statemachine.ReceiveRecordSubState.action(ReceiveRecordSubState.java:195)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementReceiveState.subStateMachine(StatementReceiveState.java:311)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementReceiveState.action(StatementReceiveState.java:200)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementController.runBody(StatementController.java:137)
    at com.teradata.jdbc.jdbc_4.statemachine.PreparedStatementController.run(PreparedStatementController.java:46)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementController.fetchRows(StatementController.java:360)
    at com.teradata.jdbc.jdbc_4.TDResultSet.goToRow(TDResultSet.java:374)
    at com.teradata.jdbc.jdbc_4.TDResultSet.next(TDResultSet.java:657)
    at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:237)
    ... 12 more

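When I googled it, I saw people getting the same error for different reasons. I know it has something to do with the timestamps I use in the where clause, but I'm not sure what exactly I need to change.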

Thanks in advance!

1 Answer:

Answer 0 (score: 1)

Sqoop uses $CONDITIONS to fetch both metadata and data.

  • Metadata: it replaces $CONDITIONS with 1 = 0. With this condition no rows are fetched, only the metadata.

  • Data with 1 mapper: it replaces $CONDITIONS with 1 = 1, so all the data is fetched.

  • Data with multiple mappers: it replaces $CONDITIONS with a range condition for each mapper.
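The three cases above can be sketched as plain string substitution. This is only an illustration of the behaviour described here, not Sqoop's actual code: the function name `substitute_conditions`, the shortened query template, and the split range on `c1` are all hypothetical.

```python
def substitute_conditions(query: str, predicate: str) -> str:
    """Return the query with the literal $CONDITIONS token replaced by predicate."""
    return query.replace("$CONDITIONS", predicate)


# Hypothetical, shortened query template used for illustration only.
template = (
    "SELECT c1, c2 FROM source_schema.table_name "
    "WHERE c1 > 0 AND $CONDITIONS"
)

# Metadata probe: no rows can match, only the column layout comes back.
metadata_query = substitute_conditions(template, "1 = 0")

# Single mapper: the predicate is always true, so every row is fetched.
single_mapper_query = substitute_conditions(template, "1 = 1")

# Multiple mappers: each mapper gets its own range (hypothetical bounds on c1).
split_query = substitute_conditions(template, "c1 >= 0 AND c1 < 1000")

for q in (metadata_query, single_mapper_query, split_query):
    print(q)
```

Printing the generated statements makes it easy to paste each one into a JDBC client and see which of them the database rejects.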

Try these queries in a JDBC client:

  •   

    select c1,c2,c3, to_char(SOURCE_ACTIVATION_DT,'YYYY-MM-DD HH24:MI:SS') as SOURCE_ACTIVATION_DT,to_char(SOURCE_DEACTIVATION_DT,'YYYY-MM-DD HH24:MI:SS') as SOURCE_DEACTIVATION_DT,to_char(EFF_DT,'YYYY-MM-DD HH24:MI:SS') as EFF_DT,to_char(EXP_DT,'YYYY-MM-DD HH24:MI:SS') as EXP_DT,to_char(SYS_UPDATE_DTM,'YYYY-MM-DD HH24:MI:SS') as SYS_UPDATE_DTM,to_char(SYS_LOAD_DTM,'YYYY-MM-DD HH24:MI:SS') as SYS_LOAD_DTM from source_schema.table_name WHERE to_char(SYS_UPDATE_DTM,'YYYY-MM-DD HH24:MI:SS')> '2017-03-30 10:00:00' OR to_char(SYS_LOAD_DTM,'YYYY-MM-DD HH24:MI:SS') > '2017-03-30 10:00:00' AND 1 = 0

  •   

    select c1,c2,c3, to_char(SOURCE_ACTIVATION_DT,'YYYY-MM-DD HH24:MI:SS') as SOURCE_ACTIVATION_DT,to_char(SOURCE_DEACTIVATION_DT,'YYYY-MM-DD HH24:MI:SS') as SOURCE_DEACTIVATION_DT,to_char(EFF_DT,'YYYY-MM-DD HH24:MI:SS') as EFF_DT,to_char(EXP_DT,'YYYY-MM-DD HH24:MI:SS') as EXP_DT,to_char(SYS_UPDATE_DTM,'YYYY-MM-DD HH24:MI:SS') as SYS_UPDATE_DTM,to_char(SYS_LOAD_DTM,'YYYY-MM-DD HH24:MI:SS') as SYS_LOAD_DTM from source_schema.table_name WHERE to_char(SYS_UPDATE_DTM,'YYYY-MM-DD HH24:MI:SS')> '2017-03-30 10:00:00' OR to_char(SYS_LOAD_DTM,'YYYY-MM-DD HH24:MI:SS') > '2017-03-30 10:00:00' AND 1 = 1

If these don't work, a sqoop command using this query will never run.
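One thing that may be worth checking while running these probes: in standard SQL, AND binds more tightly than OR, so the trailing 1 = 0 attaches only to the SYS_LOAD_DTM comparison. The sketch below shows the effect using SQLite as a stand-in for Teradata. The table `t`, its columns, and the sample rows are hypothetical, and this is not offered as a confirmed cause of the error in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, updated TEXT, loaded TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2017-03-30 11:00:00", "2017-03-29 09:00:00"),  # only 'updated' is after the cutoff
        (2, "2017-03-29 09:00:00", "2017-03-30 11:00:00"),  # only 'loaded' is after the cutoff
        (3, "2017-03-29 09:00:00", "2017-03-29 09:00:00"),  # neither is after the cutoff
    ],
)
cutoff = "2017-03-30 10:00:00"

# Parsed as: updated > cutoff OR (loaded > cutoff AND 1 = 0)
unparenthesized = conn.execute(
    "SELECT id FROM t WHERE updated > ? OR loaded > ? AND 1 = 0 ORDER BY id",
    (cutoff, cutoff),
).fetchall()

# With explicit parentheses, the 1 = 0 probe applies to the whole filter.
parenthesized = conn.execute(
    "SELECT id FROM t WHERE (updated > ? OR loaded > ?) AND 1 = 0 ORDER BY id",
    (cutoff, cutoff),
).fetchall()

print(unparenthesized)  # [(1,)]
print(parenthesized)    # []
conn.close()
```

If the 1 = 0 probe returns rows in the JDBC client, wrapping the two date comparisons in parentheses before AND \$CONDITIONS is one thing to try.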