无法将文件从ADLS(azure Datalake)加载到Hive表

时间:2017-07-05 06:12:47

标签: hadoop apache-spark apache-spark-sql hiveql

当我想使用下面的命令从我的azure datalake存储器加载文件到Hive表时,

hiveContext.sql(LOAD DATA INPATH 'adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv' OVERWRITE INTO TABLE sample.test03)

我收到错误:ApplicationMaster:User类抛出异常:java.lang.reflect.InvocationTargetException java.lang.reflect.InvocationTargetException

整个错误日志:

17/07/05 05:45:48 INFO SparkSqlParser: Parsing command: CREATE TABLE IF NOT EXISTS sample.test03 ( GEO STRING,UPC STRING,WeekEnding STRING,BaseDollars INT,BaseDollars_AnyPromo INT,BaseDollars_Display INT,BaseDollars_FeatAndDisp INT,BaseDollars_FeatAndOrDisp INT,BaseDollars_Feature INT,BaseDollars_NoPromo INT,BaseDollars_TPR INT,BaseUnits INT,BaseUnits_AnyPromo INT,BaseUnits_Display INT,BaseUnits_EQ STRING,BaseUnits_EQ_AnyPromo STRING,BaseUnits_EQ_Display STRING,BaseUnits_EQ_FeatAndDisp STRING,BaseUnits_EQ_FeatAndOrDisp STRING,BaseUnits_EQ_Feature STRING,BaseUnits_EQ_NoPromo STRING,BaseUnits_EQ_TPR STRING,BaseUnits_FeatAndDisp INT,BaseUnits_FeatAndOrDisp INT,BaseUnits_Feature INT,BaseUnits_NoPromo INT,BaseUnits_TPR INT,Dollars INT,Dollars_AnyPromo INT,Dollars_Display INT,Dollars_FeatAndDisp INT,Dollars_FeatAndOrDisp INT,Dollars_Feature INT,Dollars_NoPromo INT,Dollars_TPR INT,PACV_Discount INT,PACV_DispWOFeat INT,PACV_FeatAndDisp INT,PACV_FeatWODisp INT,Units INT,Units_AnyPromo INT,Units_Display INT,Units_EQ INT,Units_EQ_AnyPromo STRING,Units_EQ_Display STRING,Units_EQ_FeatAndDisp STRING,Units_EQ_FeatAndOrDisp STRING,Units_EQ_Feature STRING,Units_EQ_NoPromo STRING,Units_EQ_TPR STRING,Units_FeatAndDisp INT,Units_FeatAndOrDisp INT,Units_Feature INT,Units_NoPromo INT,Units_TPR INT,ACV INT )  ROW FORMAT DELIMITED  FIELDS TERMINATED BY ',' STORED AS TEXTFILE
17/07/05 05:45:49 INFO SparkSqlParser: Parsing command: LOAD DATA INPATH 'adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv' OVERWRITE INTO TABLE sample.test03
17/07/05 05:45:49 INFO SessionState: Could not get hdfsEncryptionShim, it is only applicable to hdfs filesystem.
17/07/05 05:45:49 INFO Hive: Replacing src:adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv, dest: wasb://bieno-da-d-56872-unilevercom-hdi-01@049bienobrunilevercomstg.blob.core.windows.net/hive/warehouse/sample.db/test03/NLS_SYN_SCT.csv, Status:false
17/07/05 05:45:49 ERROR ApplicationMaster: User class threw exception: java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.sql.hive.client.Shim_v0_14.loadTable(HiveShim.scala:633)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply$mcV$sp(HiveClientImpl.scala:646)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply(HiveClientImpl.scala:646)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply(HiveClientImpl.scala:646)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:280)
    at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227)
    at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226)
    at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:269)
    at org.apache.spark.sql.hive.client.HiveClientImpl.loadTable(HiveClientImpl.scala:645)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply$mcV$sp(HiveExternalCatalog.scala:248)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply(HiveExternalCatalog.scala:246)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply(HiveExternalCatalog.scala:246)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:72)
    at org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:246)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.loadTable(SessionCatalog.scala:297)
    at org.apache.spark.sql.execution.command.LoadDataCommand.run(tables.scala:335)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:682)
    at com.accenture.Unilever.Nielsen.RestatementSample.Restatement(RestatementSample.scala:70)
    at com.accenture.Unilever.StageToEnrich.RestatementLogic$.main(RestatementLogic.scala:36)
    at com.accenture.Unilever.StageToEnrich.RestatementLogic.main(RestatementLogic.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error moving: adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv into: wasb://bieno-da-d-56872-unilevercom-hdi-01@049bienobrunilevercomstg.blob.core.windows.net/hive/warehouse/sample.db/test03/NLS_SYN_SCT.csv
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2919)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1640)
    ... 44 more
Caused by: java.io.IOException: Error moving: adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv into: wasb://bieno-da-d-56872-unilevercom-hdi-01@049bienobrunilevercomstg.blob.core.windows.net/hive/warehouse/sample.db/test03/NLS_SYN_SCT.csv
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2913)
    ... 45 more

我可以从HIVE shell执行相同的代码,但是从spark脚本我得到了这个错误。我需要包含任何特殊的Jar文件吗?任何帮助将不胜感激。

0 个答案:

没有答案