Sqoop export fails when exporting Parquet files from S3 to SQL Server

Date: 2019-07-09 18:39:17

Tags: sql-server sqoop amazon-emr parquet

I am trying to use Sqoop to export a Parquet file from S3 to SQL Server, but I get this error:

19/07/09 16:12:57 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI pattern: dataset:s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet
Check that JARs for s3 datasets are on the classpath
org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI pattern: dataset:s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet
Check that JARs for s3 datasets are on the classpath
        at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:128)
        at org.kitesdk.data.Datasets.load(Datasets.java:103)
        at org.kitesdk.data.Datasets.load(Datasets.java:140)
        at org.kitesdk.data.mapreduce.DatasetKeyInputFormat$ConfigBuilder.readFrom(DatasetKeyInputFormat.java:92)
        at org.kitesdk.data.mapreduce.DatasetKeyInputFormat$ConfigBuilder.readFrom(DatasetKeyInputFormat.java:139)
        at org.apache.sqoop.mapreduce.JdbcExportJob.configureInputFormat(JdbcExportJob.java:83)
        at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:434)
        at org.apache.sqoop.manager.SQLServerManager.exportTable(SQLServerManager.java:192)
        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:80)
        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:252)

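The dataset exists at the location above, and the path URI is correct. I tried exporting a CSV file from the same path, and that worked.

Here is my Sqoop export command: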

sqoop export --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
             --connection-manager org.apache.sqoop.manager.SQLServerManager \
             --connect "jdbc:sqlserver://localhost:1433;databaseName=salesdb" \
             --table DimEmployee_test --num-mappers 128 \
             --export-dir s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet \
             --username db-user --password mypassword

1 Answer:

Answer 0: (score: 0)

Your --connect URI looks unusual. Try this format instead:

jdbc:jtds:sqlserver://<HOST>:<PORT>/<DATABASE>
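
As a rough sketch, the placeholders in that format could be filled in from the values in the question (localhost, 1433, salesdb). Note the assumptions: a jdbc:jtds: URL is handled by the jTDS driver, not the Microsoft one, so this presumes the jTDS JAR is on Sqoop's classpath and that --driver is switched to net.sourceforge.jtds.jdbc.Driver. The variable names are only illustrative, and the snippet prints the options for review instead of running Sqoop.

```shell
# Values taken from the --connect string in the question; adjust for your server.
HOST="localhost"
PORT="1433"
DATABASE="salesdb"

# Build the URL in the jTDS format suggested above.
JTDS_URL="jdbc:jtds:sqlserver://${HOST}:${PORT}/${DATABASE}"

# Assumption: the jTDS JAR is available to Sqoop; its driver class is
# net.sourceforge.jtds.jdbc.Driver. Print the options rather than running sqoop.
printf '%s\n' "--driver net.sourceforge.jtds.jdbc.Driver --connect \"${JTDS_URL}\""
```

This changes only the connection string; whether it has any effect on the DatasetNotFoundException above is not confirmed in this thread.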