Unable to create an external table in Elasticsearch using es-hadoop

Date: 2017-01-25 07:42:45

Tags: scala apache-spark hive elasticsearch-hadoop bigdata

I am running a simple spark-submit job, for example:

    spark-submit --class com.x.y.z.logan /home/test/spark/sample.jar

The table definition inside the jar file:

    hiveContext.sql("CREATE TABLE IF NOT EXISTS databasename.tablename (es_column_name STRING) " +
      "STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' " +
      "TBLPROPERTIES ('es.resource' = 'index_name/log', " +
      "'es.mapping.names' = 'tablecolumnname:es_column_name', " +
      "'es.nodes' = '192.168.x.1y:9200', " +
      "'es.input.json' = 'false', " +
      "'es.index.read.missing.as.empty' = 'yes', " +
      "'es.index.auto.create' = 'yes')")

    hiveContext.sql("INSERT INTO TABLE test.incompleterun SELECT s.streamname FROM incomplete s")



**ERROR**

    client token: N/A
    diagnostics: User class threw exception: java.lang.RuntimeException:
    java.lang.RuntimeException: class
    org.elasticsearch.hadoop.mr.EsOutputFormat$EsOutputCommitter not
    org.apache.hadoop.mapred.OutputCommitter
    ApplicationMaster host: 192.168.x.y
    ApplicationMaster RPC port: 0
    queue: root.users.test
    start time: 1485286033939
    final status: FAILED
    tracking URL: some URL
    user: test
    Exception in thread "main" org.apache.spark.SparkException:
    Application application_1485258812942_0008 finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1035)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1082)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

ES-Hadoop works fine when we create the external table through the Hive interface and load data from a Hive table into it. It does not work when we include the same query in the jar file. The same jar file works fine when we create normal Hive tables; the problem appears only when the external table creation is included in the jar, at which point the error above is thrown. Can anyone help me with this issue?
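The `EsOutputCommitter not org.apache.hadoop.mapred.OutputCommitter` message usually indicates a classloading conflict: Hadoop is checking the committer class against an `OutputCommitter` loaded from a different jar or classloader, which can happen when the elasticsearch-hadoop classes are shaded into the application jar instead of being shipped as a separate dependency. As a minimal sketch (the jar path and version below are assumptions, not taken from the question), one thing worth trying is passing the es-hadoop jar via `--jars` so the driver and executors load a single consistent copy:

```shell
# Sketch only: the elasticsearch-hadoop jar path/version is an assumption;
# adjust it to your environment. Shipping the jar via --jars (rather than
# bundling it inside sample.jar) puts one copy on every executor classpath.
spark-submit \
  --class com.x.y.z.logan \
  --jars /path/to/elasticsearch-hadoop-2.4.0.jar \
  /home/test/spark/sample.jar
```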

0 answers:

No answers