My DataFrame schema looks like this; to create a manual schema, I wrote a case class.
|-- _id: struct (nullable = true)
| |-- oid: string (nullable = true)
|-- message: string (nullable = true)
|-- powerData: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- current: array (nullable = true)
| | | |-- element: double (containsNull = true)
| | |-- delayStartTime: double (nullable = true)
| | |-- idSub1: string (nullable = true)
| | |-- motorNumber: integer (nullable = true)
| | |-- power: array (nullable = true)
| | | |-- element: double (containsNull = true)
I created a case class like this, but I'm not sure how to declare the StructFields inside it.
case class currentSchema(_id: StructType, message: String, powerData: Array[StructType])
When I apply this schema to my DataFrame, I get the following error:
val dfRef = MongoSpark.load[currentSchema](sparkSessionRef)
Exception in thread "main" scala.MatchError: org.apache.spark.sql.types.StructType (of class scala.reflect.internal.Types$ClassNoArgsTypeRef)
Has anyone done this before? Any help is appreciated.
Thanks in advance.
Answer 0 (score: 4)
You have to create a separate case class for each struct. Spark builds the schema by reflecting on the case class's field types, and StructType is a runtime schema descriptor rather than a data type it can map, which is why the reflection-based schema inference throws the MatchError.
case class idStruct(oid: String)
case class pdStruct(current: Array[Double], delayStartTime: Double, idSub1: String, motorNumber: Int, power: Array[Double])
case class currentSchema(_id: idStruct, message: String, powerData: Array[pdStruct])
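With these case classes defined at the top level (so Spark's reflection can find a TypeTag for them), the connector derives the schema automatically. A minimal sketch of how to verify and use it, assuming the same sparkSessionRef as in the question:

import org.apache.spark.sql.Encoders
import com.mongodb.spark.MongoSpark

// Derive the Spark schema from the top-level case class and print it;
// it should match the printSchema tree shown in the question.
val derivedSchema = Encoders.product[currentSchema].schema
derivedSchema.printTreeString()

// Load as before; with the nested case classes the MatchError goes away.
val dfRef = MongoSpark.load[currentSchema](sparkSessionRef)

// Optionally convert to a typed Dataset for compile-time field access.
import sparkSessionRef.implicits._
val dsRef = dfRef.as[currentSchema]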