I am writing a Spark program that uses the Elasticsearch library.
Here is my build.sbt:
scalaVersion := "2.10.5"
val sparkVersion = "2.0.1"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion % "provided",
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
"org.apache.spark" %% "spark-catalyst" % sparkVersion % "provided",
"org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
)
libraryDependencies += "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.3.3"
libraryDependencies += "org.elasticsearch" % "elasticsearch" % "2.3.3"
Here is the error message:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.SQLContext.sql(Ljava/lang/String;)Lorg/apache/spark/sql/Dataset;
at com.minsu.house.BatchProgram.process(BatchProgram.scala:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
My code is as follows:
val sqlContext = new SQLContext(sparkContext)
val dataframe = sqlContext.sql(sqlString) // <----- HERE !!!
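I don't think this has anything to do with the Elasticsearch library.
It looks like a dependency or version problem.
What should I do? Any help is appreciated. Thank you.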
Answer 0 (score: 1)
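The version of the Elasticsearch Spark connector you are trying to use (elasticsearch-spark_2.10 2.3.3) does not support Spark 2. You have two options: stay on Spark 1.x, which that connector targets, or move to a connector built for Spark 2 (the elasticsearch-spark-20 artifacts in the 5.x line).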
For example, I used org.elasticsearch:elasticsearch-spark-20_2.11:5.0 with the following Spark 2 code:
// add to your class imports
import org.apache.spark.sql.SparkSession
import org.elasticsearch.spark.sql._
// Use Spark 2.0 SparkSession object to provide your config
val sparkSession = SparkSession.builder().config(...).getOrCreate()
// Optional step, imports things like $"column"
import sparkSession.implicits._
// Specify your index and type in ES
val df = sparkSession.esDF("index/type")
// Perform an action
df.count()
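
A build.sbt for the Spark 2 route could look roughly like the sketch below. The Scala version (2.11.8) and the exact connector version (5.0.0) are assumptions. The answer only says "5.0", so check which release matches your Elasticsearch cluster before copying this.

```scala
// Assumption: Scala 2.11, to match the _2.11 connector artifact named above
scalaVersion := "2.11.8"

val sparkVersion = "2.0.1"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided",
  "org.apache.spark" %% "spark-hive" % sparkVersion % "provided",
  // %% appends the Scala binary version, resolving to elasticsearch-spark-20_2.11
  // Assumption: 5.0.0 stands in for the "5.0" mentioned in the answer
  "org.elasticsearch" %% "elasticsearch-spark-20" % "5.0.0"
)
```

Because the Spark modules are marked "provided", the Spark classes at run time come from the installation behind spark-submit, not from the jar you build. That installation therefore also needs to be Spark 2.x. Printing sparkSession.version at startup is a quick way to confirm which version is actually running.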