I defined my class as follows:
import scala.beans.BeanProperty

class AbnormalSim2(Cust_Id: String, Trx_Dt: String, Cur_Bns_Bal: String) {
  @BeanProperty var Cust_Id1: String = _
  @BeanProperty var Trx_Dt1: String = _
  @BeanProperty var Cur_Bns_Bal1: BigDecimal = _
}
Then:

import spark.implicits._
implicit val mapEncoder1 = org.apache.spark.sql.Encoders.kryo[AbnormalSim2]
abnormal.select("Cust_Id", "Trx_Dt", "Cur_Bns_Bal").as[AbnormalSim2].show()
This failed with:

org.apache.spark.sql.AnalysisException: Try to map struct to Tuple1, but failed as the number of fields does not line up.
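My guess (an assumption on my part, not something I have confirmed) is that a Kryo encoder serializes the whole object into a single binary column, so the three selected columns cannot line up with it. Inspecting the encoder's schema seems consistent with that:

import org.apache.spark.sql.Encoders

// Print the schema the Kryo encoder produces; I expect a single binary
// field named "value" rather than the three selected columns.
println(Encoders.kryo[AbnormalSim2].schema)
// e.g. StructType(StructField(value,BinaryType,true))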
Then I defined the class as follows instead:
case class AbnormalSim() {
  @BeanProperty var Cust_Id: String = _
  @BeanProperty var Trx_Dt: String = _
  @BeanProperty var Cur_Bns_Bal: BigDecimal = _
}
import spark.implicits._
abnormal.select("Cust_Id", "Trx_Dt", "Cur_Bns_Bal").as[AbnormalSim].show()
This worked.
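For reference, here is a minimal self-contained sketch of the working variant. The sample rows, the AbnormalSim3 name, the Repro object, and the local SparkSession are all hypothetical stand-ins for my real setup; I believe a plain case class with constructor parameters (no @BeanProperty vars) would also work:

import org.apache.spark.sql.SparkSession

// Plain case class whose fields match the selected column names and types.
case class AbnormalSim3(Cust_Id: String, Trx_Dt: String, Cur_Bns_Bal: BigDecimal)

object Repro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("as-dataset").getOrCreate()
    import spark.implicits._

    // Made-up rows standing in for the real `abnormal` DataFrame.
    val abnormal = Seq(
      ("C001", "2017-01-01", BigDecimal("12.34")),
      ("C002", "2017-01-02", BigDecimal("56.78"))
    ).toDF("Cust_Id", "Trx_Dt", "Cur_Bns_Bal")

    // Converts the DataFrame to a Dataset[AbnormalSim3] and prints it.
    abnormal.select("Cust_Id", "Trx_Dt", "Cur_Bns_Bal").as[AbnormalSim3].show()

    spark.stop()
  }
}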
My question: when we use .as[T] to convert a DataFrame to a Dataset, does T have to be a case class?