I'm trying to define a UDF in Spark 2.3.0 using Scala 2.11.12. From reading the docs, it seems I need to use SparkSession.udf() to do this.
However, I can't import that object:
import org.apache.spark.sql.SparkSession
results in:
Error:(2, 8) object SparkSession is not a member of package org.apache.spark.sql
import org.apache.spark.sql.SparkSession
Here is my build.sbt:
name := "webtrends-processing-scala"
version := "0.1"
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.3"
libraryDependencies += "io.lemonlabs" %% "scala-uri" % "1.4.3"
Answer (score: 2)
You have to include the spark-sql dependency:
libraryDependencies ++= Seq("org.apache.spark" %% "spark-core" % "2.3.0",
"org.apache.spark" %% "spark-sql" % "2.3.0")
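Once spark-sql is on the classpath, the import resolves and a UDF can be defined. A minimal sketch (the app name, master setting, and the `toUpper` function are placeholders for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object UdfExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration; master/appName are placeholder values
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-example")
      .getOrCreate()
    import spark.implicits._

    // Register a UDF through spark.udf, usable in SQL expressions
    spark.udf.register("toUpper", (s: String) => s.toUpperCase)

    // Or wrap a function with functions.udf for the DataFrame API
    val toUpperUdf = udf((s: String) => s.toUpperCase)

    val df = Seq("a", "b").toDF("letter")
    df.select(toUpperUdf($"letter")).show()

    spark.stop()
  }
}
```

Note that the question's build.sbt pins spark-core to 2.3.3; whichever version is used, spark-core and spark-sql should be kept at the same version to avoid binary incompatibilities.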