I am migrating an application from Spark Streaming to Structured Streaming. The application reads logs from a Kafka topic, parses them, and saves the result to Cassandra.
I get the following compile error:

C:x\x\\CassandraHelper.scala:425:122: Unable to find encoder for type com.xx.dtl.business.cassandra.ConnectionCassDto. An implicit Encoder[com.xx.dtl.business.cassandra.ConnectionCassDto] is needed to store com.xx.dtl.business.cassandra.ConnectionCassDto instances in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
[error] val successConnectionDS = connectionDS.filter(x => x.libelleOperation.equals(xx_SUCCESSFULL_IDENTIFICATION)).flatMap(connection => mapToDto(connection))
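From what I understand, the flatMap needs an implicit Encoder[ConnectionCassDto] in scope, and spark.implicits._ only derives encoders for primitives and Product types (case classes), which my DTO apparently is not. As a sketch of what the compiler seems to be asking for, the explicit Kryo fallback would look like this (it just has to be visible where flatMap is called):

import org.apache.spark.sql.{Encoder, Encoders}

// Sketch: an explicit encoder that would satisfy the flatMap call.
// Kryo serializes the whole object into a single binary column, so
// named columns are lost downstream; this is only meant to show the
// kind of implicit the compiler wants in scope.
implicit val dtoEncoder: Encoder[ConnectionCassDto] =
  Encoders.kryo[ConnectionCassDto]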
Here is the code where the error occurs:
def persisteConnection(connectionDS: Dataset[Connection]): Unit = {
  val successConnectionDS = connectionDS
    .filter(x => x.libelleOperation.equals(ATOS_SUCCESSFULL_IDENTIFICATION))
    .flatMap(connection => mapToDto(connection))
  val failedConnectionDS = connectionDS
    .filter(x => x.libelleOperation.equals(ATOS_FAILURE_IDENTIFICATION))
    .flatMap(connection => mapToDto(connection))

  successConnectionDS.saveToCassandra(
    AppConf.CassandraReferentielValorizationKeySpace, "connexion_reussie",
    SomeColumns("identifiant_web", "date_connexion", "code_pays", "coords",
      "city_name", "region_name", "isp", "asn", "id_personne", "id_dim_temps",
      "ip", "pays", "session_id", "client_media_id", "brs_session_id"))

  failedConnectionDS.saveToCassandra(
    AppConf.CassandraReferentielValorizationKeySpace, "connexion_echouee",
    SomeColumns("identifiant_web", "date_connexion", "code_pays", "coords",
      "city_name", "region_name", "isp", "asn", "id_personne", "id_dim_temps",
      "ip", "pays", "session_id", "client_media_id", "brs_session_id"))
}
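A side question, since I kept the DStream-era write path: as far as I can tell, saveToCassandra comes from the connector's RDD/DStream API, so I am not sure it can work on a (streaming) Dataset at all. This is a sketch of the foreachBatch pattern (available since Spark 2.4) I am considering instead, reusing the keyspace and table from above:

import org.apache.spark.sql.{Dataset, SaveMode}

// Sketch only: write each micro-batch through the connector's batch
// source. Assumes the ConnectionCassDto field names match the
// Cassandra column names; keyspace/table copied from the code above.
successConnectionDS.writeStream
  .foreachBatch { (batch: Dataset[ConnectionCassDto], batchId: Long) =>
    batch.write
      .format("org.apache.spark.sql.cassandra")
      .option("keyspace", AppConf.CassandraReferentielValorizationKeySpace)
      .option("table", "connexion_reussie")
      .mode(SaveMode.Append)
      .save()
  }
  .start()

For completeness, here is the mapping function: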
def mapToDto(connection: Connection): Option[ConnectionCassDto] = {
  Some(new ConnectionCassDto(
    connection.id_web,
    connection.id_dim_temps,
    connection.timestamp,
    connection.contact_id,
    EmptyStringField,
    connection.code_pays,
    connection.coords.mkString(", "),
    connection.city_name,
    connection.region_name,
    connection.isp,
    connection.asn,
    connection.ip,
    connection.sessionID,
    connection.client_media_id,
    connection.brsSessionId))
}

Basically, I replaced all the DStreams with Datasets, along with the way I read from Kafka. I did not change anything in the parsing step; I process the Dataset the same way I processed the DStream.
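In case it matters: ConnectionCassDto is a plain class instantiated with new, not a case class. If I read the error correctly, declaring it as a top-level case class would let spark.implicits._ derive the encoder automatically. A hypothetical reconstruction, with field names copied from the SomeColumns list above and all types guessed as String:

// Hypothetical shape: field names taken from the SomeColumns list,
// types guessed; the real constructor order and types would need to
// be carried over from the existing class.
case class ConnectionCassDto(
  identifiant_web: String,
  date_connexion: String,
  code_pays: String,
  coords: String,
  city_name: String,
  region_name: String,
  isp: String,
  asn: String,
  id_personne: String,
  id_dim_temps: String,
  ip: String,
  pays: String,
  session_id: String,
  client_media_id: String,
  brs_session_id: String)

With a case class like that, the existing new ConnectionCassDto(...) call still compiles, and import spark.implicits._ near the flatMap should provide the missing encoder.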
Any clue?