How to use java.time.LocalDate in a Dataset (fails with java.lang.UnsupportedOperationException: No Encoder found)?

Asked: 2017-07-19 13:59:55

Tags: scala apache-spark apache-spark-sql

  • Spark 2.1.1
  • Scala 2.11.8
  • Java 8
  • Linux Ubuntu 16.04 LTS

I want to convert an RDD into a Dataset. For that, I use the implicit toDS() method, which gives the following error:

    Exception in thread "main" java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
    - field (class: "java.time.LocalDate", name: "date")
    - root class: "observatory.TemperatureRow"
        at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:602)
        at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:596)
        at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:587)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
        at scala.collection.immutable.List.flatMap(List.scala:344)

In my case, I have to use the java.time.LocalDate type; I cannot use java.sql.Date. I have read that I need to tell Spark how to transform the Java type into a SQL type, so going in that direction I built the two implicit functions below:

    implicit def toSerialized(t: TemperatureRow): EncodedTemperatureRow = EncodedTemperatureRow(t.date.toString, t.location, t.temperature)
    implicit def fromSerialized(t: EncodedTemperatureRow): TemperatureRow = TemperatureRow(LocalDate.parse(t.date), t.location, t.temperature)

Here is some code from my application:

    case class Location(lat: Double, lon: Double)

    case class TemperatureRow(
      date: LocalDate,
      location: Location,
      temperature: Double
    )

    case class EncodedTemperatureRow(
      date: String,
      location: Location,
      temperature: Double
    )

    val s = Seq[TemperatureRow](
      TemperatureRow(LocalDate.parse("2017-01-01"), Location(1.4, 5.1), 4.9),
      TemperatureRow(LocalDate.parse("2014-04-05"), Location(1.5, 2.5), 5.5)
    )

    import spark.implicits._
    val temps: RDD[TemperatureRow] = sc.parallelize(s)
    val tempsDS = temps.toDS

I do not know why Spark searches for an encoder for java.time.LocalDate, since I provide implicit conversions between TemperatureRow and EncodedTemperatureRow...

1 Answer:

Answer 0 (score: 9)

java.time.LocalDate is not supported in Spark 2.2 (I have been trying to write an Encoder for that type for some time and failed).

You have to convert java.time.LocalDate to some other supported type; java.sql.Timestamp and java.sql.Date are supported candidates.
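A minimal sketch of that conversion, assuming the asker's case classes from the question (the SqlTemperatureRow name and the toSql/fromSql helpers are illustrative, not from the original post): mirror the case class with java.sql.Date, convert on the way into the Dataset, and convert back when reading results out.

```scala
import java.time.LocalDate
import java.sql.Date

case class Location(lat: Double, lon: Double)
case class TemperatureRow(date: LocalDate, location: Location, temperature: Double)

// Mirror of TemperatureRow using a date type Spark has a built-in Encoder for.
case class SqlTemperatureRow(date: Date, location: Location, temperature: Double)

// java.sql.Date <-> java.time.LocalDate conversions ship with Java 8.
def toSql(t: TemperatureRow): SqlTemperatureRow =
  SqlTemperatureRow(Date.valueOf(t.date), t.location, t.temperature)

def fromSql(t: SqlTemperatureRow): TemperatureRow =
  TemperatureRow(t.date.toLocalDate, t.location, t.temperature)
```

With something like this in place, sc.parallelize(s.map(toSql)).toDS should compile, because spark.implicits._ can derive an Encoder for a case class whose fields are all supported types.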