I wrote the following code to convert a DataFrame into a Dataset using a case class:
def toDs[T](df: DataFrame): Dataset[T] = {
df.as[T]
}
together with this case class:
case class DATA(name: String, age: Double, location: String)
When I compile it, I get:
Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
[error] df.as[T]
Is there any workaround?
Answer 0: (score: 0)
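You can read data into a Dataset[MyCaseClass] in either of the following two ways.
Suppose you have the following case class:
case class MyCaseClass(/* fields matching the columns of your data */)
1) First way: import the implicits of the SparkSession in scope and use the as operator to convert your DataFrame into a Dataset[MyCaseClass]:
import org.apache.spark.sql.{Dataset, SparkSession}
val spark: SparkSession = SparkSession.builder.enableHiveSupport.getOrCreate()
// Brings the built-in encoders for primitives and case classes into scope
import spark.implicits._
val ds: Dataset[MyCaseClass] = spark.read.format("FORMAT_HERE").load().as[MyCaseClass]
2) Second way: create your own encoder in another object and import it into your current code.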
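For the second way (an encoder defined in another object), and to tie it back to the generic toDs from the question, here is a minimal sketch rather than a definitive implementation. It assumes the DATA case class from the question; the object names MyEncoders and ToDsSketch, the local master setting, and the sample row are hypothetical and only there for illustration. The idea is that df.as[T] needs an implicit Encoder[T], so a generic method has to ask its caller for one, for example through a context bound.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoder, Encoders, SparkSession}

// Case class from the question; keep it top-level so Spark can derive an encoder for it.
case class DATA(name: String, age: Double, location: String)

// Hypothetical holder object for a hand-written encoder (the "second way").
object MyEncoders {
  implicit val dataEncoder: Encoder[DATA] = Encoders.product[DATA]
}

// Hypothetical wrapper object, used only to make the sketch self-contained.
object ToDsSketch {
  // The context bound `T: Encoder` makes the caller supply the Encoder[T]
  // that df.as[T] requires, so the generic method compiles.
  def toDs[T: Encoder](df: DataFrame): Dataset[T] = df.as[T]

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for trying the sketch out.
    val spark = SparkSession.builder.master("local[*]").appName("toDs-sketch").getOrCreate()

    // Bring the custom encoder into implicit scope at the call site.
    import MyEncoders._

    // Made-up sample row whose column names match the fields of DATA.
    val df = spark.createDataFrame(Seq(("Ann", 31.0, "Oslo"))).toDF("name", "age", "location")

    val ds: Dataset[DATA] = toDs[DATA](df)
    ds.show()

    spark.stop()
  }
}
```

With this shape, the call site decides where the encoder comes from: importing spark.implicits._ instead of MyEncoders._ should also satisfy the context bound for a case class, which corresponds to the first way.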