I am trying to export data from HDFS to MS SQL Server, but I am getting a data type conversion error.
Sqoop export command:
sqoop export --connect "jdbc:sqlserver://..." --username=... --password=... \
    --hcatalog-database ... --hcatalog-table ... \
    --hcatalog-partition-keys ... --hcatalog-partition-values ... \
    --table ... -- --schema ...
Log:
Log Contents:
2018-10-31 10:04:12,371 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2018-10-31 10:04:12,463 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2018-10-31 10:04:12,463 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2018-10-31 10:04:12,475 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2018-10-31 10:04:12,476 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1539329114857_75069, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@1224144a)
2018-10-31 10:04:12,770 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2018-10-31 10:04:13,085 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /hadoopmetadata/yarn/local/usercache/.../appcache/application_1539329114857_75069
2018-10-31 10:04:13,449 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2018-10-31 10:04:14,101 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2018-10-31 10:04:14,372 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: org.apache.sqoop.mapreduce.hcat.SqoopHCatInputSplit@2ca65ce4
2018-10-31 10:04:14,919 INFO [main] org.apache.hadoop.hive.ql.io.orc.ReaderImpl: Reading ORC rows from hdfs://.../000001_0 with {include: null, offset: 0, length: 1986948}
2018-10-31 10:04:14,993 INFO [main] org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl: Schema on read not provided -- using file schema [kind: STRUCT
subtypes: 1
subtypes: 2
subtypes: 3
subtypes: 4
subtypes: 5
subtypes: 6
subtypes: 7
subtypes: 8
subtypes: 9
subtypes: 10
subtypes: 11
subtypes: 12
subtypes: 13
subtypes: 14
subtypes: 15
subtypes: 16
subtypes: 17
subtypes: 18
subtypes: 19
subtypes: 20
subtypes: 21
subtypes: 22
subtypes: 23
subtypes: 24
subtypes: 25
subtypes: 26
fieldNames: "_col0"
fieldNames: "_col1"
fieldNames: "_col2"
fieldNames: "_col3"
fieldNames: "_col4"
fieldNames: "_col5"
fieldNames: "_col6"
fieldNames: "_col7"
fieldNames: "_col8"
fieldNames: "_col9"
fieldNames: "_col10"
fieldNames: "_col11"
fieldNames: "_col12"
fieldNames: "_col13"
fieldNames: "_col14"
fieldNames: "_col15"
fieldNames: "_col16"
fieldNames: "_col17"
fieldNames: "_col18"
fieldNames: "_col19"
fieldNames: "_col20"
fieldNames: "_col21"
fieldNames: "_col22"
fieldNames: "_col23"
fieldNames: "_col24"
fieldNames: "_col25"
, kind: DOUBLE
, kind: DOUBLE
, kind: DOUBLE
, kind: DOUBLE
, kind: DOUBLE
, kind: DOUBLE
, kind: LONG
, kind: LONG
, kind: LONG
, kind: DOUBLE
, kind: DOUBLE
, kind: DOUBLE
, kind: DOUBLE
, kind: DOUBLE
, kind: STRING
, kind: STRING
, kind: INT
, kind: INT
, kind: STRING
, kind: STRING
, kind: STRING
, kind: STRING
, kind: STRING
, kind: STRING
, kind: STRING
, kind: STRING
]
2018-10-31 10:04:15,080 INFO [main] org.apache.hive.hcatalog.mapreduce.InternalUtil: Initializing org.apache.hadoop.hive.ql.io.orc.OrcSerde with properties {transient_lastDdlTime=1524238479, name=..., serialization.null.format=\N, columns=..., serialization.lib=org.apache.hadoop.hive.ql.io.orc.OrcSerde, serialization.format=1, columns.types=double,double,double,double,double,double,bigint,bigint,bigint,double,double,double,double,double,string,string,int,int,string,string,string,string,string,string,string,string, columns.comments=nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull}
2018-10-31 10:04:17,349 WARN [Thread-12] org.apache.sqoop.mapreduce.SQLServerExportDBExecThread: Error executing statement: java.sql.BatchUpdateException: Error converting data type nvarchar to decimal.
2018-10-31 10:04:17,350 WARN [Thread-12] org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread: Trying to recover from DB write failure:
java.sql.BatchUpdateException: Error converting data type nvarchar to decimal.
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeBatch(SQLServerPreparedStatement.java:1870)
at org.apache.sqoop.mapreduce.SQLServerExportDBExecThread.executeStatement(SQLServerExportDBExecThread.java:96)
at org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread.write(SQLServerAsyncDBExecThread.java:272)
at org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread.run(SQLServerAsyncDBExecThread.java:240)
2018-10-31 10:04:17,354 WARN [Thread-12] org.apache.sqoop.mapreduce.db.SQLServerConnectionFailureHandler: Cannot handle error with SQL State: S0005
2018-10-31 10:04:17,354 ERROR [Thread-12] org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread: Failed to write records.
java.io.IOException: Registered handler cannot recover error with SQL State: S0005, error code: 8114
at org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread.write(SQLServerAsyncDBExecThread.java:293)
at org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread.run(SQLServerAsyncDBExecThread.java:240)
Caused by: java.sql.BatchUpdateException: Error converting data type nvarchar to decimal.
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeBatch(SQLServerPreparedStatement.java:1870)
at org.apache.sqoop.mapreduce.SQLServerExportDBExecThread.executeStatement(SQLServerExportDBExecThread.java:96)
at org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread.write(SQLServerAsyncDBExecThread.java:272)
... 1 more
2018-10-31 10:04:17,354 ERROR [Thread-12] org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread: Got exception in update thread: java.io.IOException: Registered handler cannot recover error with SQL State: S0005, error code: 8114
at org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread.write(SQLServerAsyncDBExecThread.java:293)
at org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread.run(SQLServerAsyncDBExecThread.java:240)
Caused by: java.sql.BatchUpdateException: Error converting data type nvarchar to decimal.
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeBatch(SQLServerPreparedStatement.java:1870)
at org.apache.sqoop.mapreduce.SQLServerExportDBExecThread.executeStatement(SQLServerExportDBExecThread.java:96)
at org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread.write(SQLServerAsyncDBExecThread.java:272)
... 1 more
2018-10-31 10:04:17,568 ERROR [main] org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread: Asynchronous writer thread encountered the following exception: java.io.IOException: Registered handler cannot recover error with SQL State: S0005, error code: 8114
2018-10-31 10:04:17,569 INFO [Thread-13] org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
End of LogType:syslog
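For what it is worth, SQL Server error 8114 ("Error converting data type %ls to %ls.", reported here with SQLSTATE S0005) is raised on the server side, and the nvarchar in the message most likely refers to a parameter value the JDBC driver sent, not to a column in either table: the Microsoft JDBC driver sends Java String parameters as nvarchar by default (sendStringParametersAsUnicode=true). A minimal T-SQL repro, assuming one of the exported values reached the server as a non-numeric Unicode string (for example a Java Double NaN rendered as text, or a stray null marker):

-- Hypothetical repro of error 8114: a non-numeric Unicode string
-- cannot be converted to a decimal target.
DECLARE @p nvarchar(10) = N'NaN';   -- stands in for a bad exported value
SELECT CAST(@p AS decimal(18,0));   -- raises error 8114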
Hive table:
CREATE TABLE IF NOT EXISTS HiveTableName (
a DOUBLE,
b DOUBLE,
c DOUBLE,
d DOUBLE,
e DOUBLE,
f DOUBLE,
g BIGINT,
h BIGINT,
i BIGINT,
j DOUBLE,
k DOUBLE,
l DOUBLE,
m DOUBLE,
n DOUBLE,
o STRING,
p STRING,
q INT,
r INT,
s STRING,
t STRING,
u STRING,
v STRING,
w STRING,
x STRING,
y STRING,
z STRING
)
PARTITIONED BY (xyz STRING)
STORED AS ORC;
MS SQL table:
create table MssqlTable (
a decimal,
b decimal,
c decimal,
d decimal,
e decimal,
f decimal,
g bigint,
h bigint,
i bigint,
j decimal,
k decimal,
l decimal,
m decimal,
n decimal,
o varchar(100),
p varchar(100),
q int,
r int,
s varchar(100),
t varchar(100),
u varchar(100),
v varchar(100),
w varchar(100),
x varchar(100),
y varchar(100),
z varchar(100),
xyz varchar(100)
);
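One thing worth noting about this DDL: in SQL Server, decimal with no arguments means decimal(18,0), so even once the conversion error is fixed, the fractional part of the Hive DOUBLE columns would be rounded away. If fractional values matter, the numeric columns probably need an explicit scale, for example:

-- Sketch: decimal(18,6) is an arbitrary choice; pick a precision
-- and scale that fit the actual data.
ALTER TABLE MssqlTable ALTER COLUMN a decimal(18,6);
ALTER TABLE MssqlTable ALTER COLUMN b decimal(18,6);
-- ...and likewise for the other former-DOUBLE columns (c..f, j..n).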
I don't understand the "Error converting data type nvarchar to decimal" message; neither of my tables has an nvarchar column. Can I tell Sqoop to convert double to decimal, string to varchar, and so on?
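Two things that may help. First, Sqoop does have a per-column override, --map-column-java, which forces the Java type Sqoop uses for a column; I am not certain it is honored for --hcatalog-table exports, where the Hive column types normally drive the mapping, but a sketch reusing the command above would look like:

sqoop export --connect "jdbc:sqlserver://..." --username=... --password=... \
    --hcatalog-database ... --hcatalog-table ... \
    --hcatalog-partition-keys ... --hcatalog-partition-values ... \
    --map-column-java a=Double,o=String \
    --table ... -- --schema ...

(a=Double,o=String just picks two columns from the DDL above as examples.) Second, since the target types already line up with the Hive types, the culprit may instead be a value SQL Server cannot parse as a number, such as NaN or Infinity in one of the DOUBLE columns. A hedged HiveQL check, relying on the Java rule that NaN is the only double value not equal to itself:

-- Look for NaN / Infinity in one DOUBLE column; repeat per column.
SELECT count(*) FROM HiveTableName
WHERE a <> a
   OR abs(a) = cast('Infinity' AS double);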