I am trying to load a dataframe into a Hive table.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql._

object SparkToHive {
  def main(args: Array[String]) {
    val warehouseLocation = "file:${system:user.dir}/spark-warehouse"
    val sparkSession = SparkSession.builder.master("local[2]")
      .appName("Saving data into HiveTable using Spark")
      .enableHiveSupport()
      .config("hive.exec.dynamic.partition", "true")
      .config("hive.exec.dynamic.partition.mode", "nonstrict")
      .config("hive.metastore.warehouse.dir", "/user/hive/warehouse")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .getOrCreate()

    import sparkSession.implicits._

    val partfile = sparkSession.read.text("partfile").as[String]
    val partdata = partfile.map(part => part.split(","))

    case class PartClass(id: Int, name: String, salary: Int, dept: String, location: String)

    val partRDD = partdata.map(line => PartClass(line(0).toInt, line(1), line(2).toInt, line(3), line(4)))
    val partDF = partRDD.toDF()
    partDF.write.mode(SaveMode.Append).insertInto("parttab")
  }
}
I haven't run it yet, but I am getting the following error at this line:

import sparkSession.implicits._

could not find implicit value for evidence parameter of type org.apache.spark.sql.Encoder[String]

How can I fix this?
Answer 0 (score: 4)
Please move the case class PartClass outside the SparkToHive object; then it should be fine.

Also, check where the implicits import sits in your declaration. Try:

import sparkSession.sqlContext.implicits._
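Putting the suggestions together, here is a minimal sketch of the corrected structure. The table name parttab, file name partfile, and the column schema are taken from the question; everything else (session settings, local master) is illustrative only:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// The case class is defined at the top level, outside the object,
// so Spark can derive an Encoder for it via the implicits import.
case class PartClass(id: Int, name: String, salary: Int, dept: String, location: String)

object SparkToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[2]")
      .appName("Saving data into HiveTable using Spark")
      .config("hive.exec.dynamic.partition", "true")
      .config("hive.exec.dynamic.partition.mode", "nonstrict")
      .enableHiveSupport()
      .getOrCreate()

    import spark.implicits._

    // read.textFile returns a Dataset[String], one element per line,
    // so map can use the built-in String encoder
    val partDF = spark.read.textFile("partfile")
      .map(_.split(","))
      .map(a => PartClass(a(0).toInt, a(1), a(2).toInt, a(3), a(4)))
      .toDF()

    partDF.write.mode(SaveMode.Append).insertInto("parttab")
  }
}
```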
Answer 1 (score: 2)
The mistakes I made were:

1. The case class should be outside main and inside the object.
2. In this line: val partfile = sparkSession.read.text("partfile").as[String], I used read.text("..") to load the file into Spark; we can instead use read.textFile("...").
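To illustrate the second point, here is a sketch of the difference between the two read methods, assuming an active SparkSession named spark with its implicits imported:

```scala
// read.text returns a DataFrame (Dataset[Row]) with a single "value"
// column, so .as[String] is needed before mapping over plain strings
val asRows = spark.read.text("partfile")

// read.textFile returns a Dataset[String] directly, so map operations
// can use the built-in String encoder with no extra conversion
val asStrings = spark.read.textFile("partfile")
```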