How to save a JSON file to ElasticSearch using Spark?

Date: 2018-07-19 13:44:21

Tags: scala apache-spark elasticsearch

I am trying to save a JSON file to ElasticSearch, but it is not working.

Here is my code:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.elasticsearch.spark.sql._
import org.apache.spark.SparkConf

object HelloEs {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WriteToES").setMaster("local")
    conf.set("es.index.auto.create", "true")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val sen_p = sqlContext.read.json("/home/Bureau/mydoc/Orange.json")
    sen_p.registerTempTable("sensor_ptable")
    sen_p.saveToEs("sensor/metrics")
  }

}

I get the following error:

Exception in thread "main" java.lang.NoSuchMethodError:  org.elasticsearch.spark.sql.package$.sparkDataFrameFunctions(Lorg/apache/spark/sql/Dataset;)Lorg/elasticsearch/spark/sql/package$SparkDataFrameFunctions;
    at learnscala.HelloEs$.main(HelloEs.scala:20)
    at learnscala.HelloEs.main(HelloEs.scala)
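A `NoSuchMethodError` like this usually means the `elasticsearch-spark` connector on the classpath was built against a different Spark or Scala version than the one actually running. One way to avoid it is to align the connector artifact with your Spark and Scala versions in the build file; the exact versions below are assumptions and should be matched to your own setup:

```scala
// build.sbt (sketch) -- versions here are illustrative, match them to your cluster.
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Spark 2.x built for Scala 2.11
  "org.apache.spark" %% "spark-sql" % "2.3.1" % "provided",
  // The elasticsearch-spark artifact must match the Spark major version
  // ("20" = Spark 2.x) and the Scala binary version suffix ("_2.11").
  "org.elasticsearch" % "elasticsearch-spark-20_2.11" % "6.3.1"
)
```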

1 Answer:

Answer 0 (score: 1)

There are multiple ways to save an RDD/DataFrame to Elastic Search.

An RDD can be written to Elastic Search using:

import org.elasticsearch.spark.rdd.EsSpark

EsSpark.saveToEs(rdd, "<ES_RESOURCE_PATH>")

A Spark DataFrame can be written to ES through the org.elasticsearch.spark.sql data source:

object HelloEs {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WriteToES").setMaster("local")
    conf.set("es.index.auto.create", "true")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val sen_p = sqlContext.read.json("/home/Bureau/mydoc/Orange.json")
    sen_p.write.format("org.elasticsearch.spark.sql")
      .mode("append")
      .option("es.resource", "<ES_RESOURCE_PATH>")
      .option("es.nodes", "http://<ES_HOST>:9200")
      .save()
  }

}
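When running with spark-submit rather than from an IDE, the connector can also be pulled in at submit time via `--packages`. The coordinates and the jar path below are assumptions; adjust the `_2.11` suffix and version to your own Spark/Scala environment:

```shell
# Fetch the ES-Hadoop Spark connector from Maven Central at submit time.
# The jar path is illustrative -- use the jar your build actually produces.
spark-submit \
  --class learnscala.HelloEs \
  --master local \
  --packages org.elasticsearch:elasticsearch-spark-20_2.11:6.3.1 \
  target/scala-2.11/helloes_2.11-0.1.jar
```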

In your case, modify the code as above: replace the saveToEs call with the DataFrame write, setting es.resource to sensor/metrics and es.nodes to your Elasticsearch host.
