Parsing an Avro schema from a Kafka string

Asked: 2015-12-22 12:01:07

Tags: parsing deserialization apache-kafka spark-streaming avro

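How do I parse Avro data in Apache Spark when it arrives from Kafka encoded as a string? I am using Apache Spark Streaming. My clickstream is stored in Avro format, and I use the Divolte Collector to capture the clicks. I also use Kafka to deliver the clickstream to Spark Streaming in real time. Now I want to deserialize this Avro string against its schema so that Spark can process it further.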

I tried to do this with a Scala case class and the Gensler scalavro library, but without success. Here is the code:

case class Change(detectedDuplicate : Boolean,
         detectedCorruption: Boolean,
         firstInSession: Boolean,
         timestamp: Long,
         remoteHost: String,
         referer: String,
         location: String,
         viewportPixelWidth: Int,
         viewportPixelHeight: Int,
         screenPixelWidth: Int,
         screenPixelHeight: Int,
         partyId: String,
         sessionId: String,
         pageViewId: String,
         eventType: String,
         userAgentString: String,
         userAgentName: String,
         userAgentFamily: String,
         userAgentVendor: String,
         userAgentType: String,
         userAgentVersion: String,
         userAgentDeviceCategory: String,
         userAgentOsFamily: String,
         userAgentOsVersion: String,
         userAgentOsVendor: String)

object kafkaParser {

  def parse(event: String): Change = {
    val m = AvroType[Change]
    return new Change(m.schema) // compile error here: "unspecified value parameters"
  }
}

m.schema gives me the schema of the Avro file.

http://genslerappspod.github.io/scalavro/

Please help me solve this problem.
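
For reference, the error at the marked line is what the compiler reports when the Change constructor, which expects one argument per field, is called with a single schema value: a schema describes the record layout but carries no field values. Below is a minimal sketch of decoding the record bytes instead, using the plain Apache Avro Java API rather than scalavro. It assumes that the Kafka message value is a single schemaless Avro binary record, that it is consumed as Array[Byte] rather than String (Avro binary is generally not valid text), and that the reader schema matches the one the producer wrote with. The Click case class, its three-field schema and the encode helper are hypothetical and exist only to make the round trip self-contained.

```scala
import java.io.ByteArrayOutputStream

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}

// Hypothetical cut-down record: only three of the fields from the question.
case class Click(timestamp: Long, remoteHost: String, firstInSession: Boolean)

object AvroBytesParser {

  // Assumed schema; in practice this must be the exact schema used by the producer.
  val schema: Schema = new Schema.Parser().parse(
    """{"type":"record","name":"Click","fields":[
      |  {"name":"timestamp","type":"long"},
      |  {"name":"remoteHost","type":"string"},
      |  {"name":"firstInSession","type":"boolean"}
      |]}""".stripMargin)

  // Decode one schemaless Avro binary record and copy its fields into the case class.
  def parse(bytes: Array[Byte]): Click = {
    val reader  = new GenericDatumReader[GenericRecord](schema)
    val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
    val record  = reader.read(null, decoder)
    Click(
      record.get("timestamp").asInstanceOf[Long],
      record.get("remoteHost").toString, // Avro strings are decoded as Utf8, hence toString
      record.get("firstInSession").asInstanceOf[Boolean])
  }

  // Helper used only to produce sample bytes for the round trip in main.
  def encode(click: Click): Array[Byte] = {
    val record = new GenericData.Record(schema)
    record.put("timestamp", Long.box(click.timestamp))
    record.put("remoteHost", click.remoteHost)
    record.put("firstInSession", Boolean.box(click.firstInSession))

    val out     = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(out, null)
    new GenericDatumWriter[GenericRecord](schema).write(record, encoder)
    encoder.flush()
    out.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val original = Click(1450785667000L, "127.0.0.1", firstInSession = true)
    val decoded  = parse(encode(original))
    println(decoded)
  }
}
```

In a Spark Streaming job the same parse function could be mapped over the message values, provided the Kafka stream is created with a byte-array value decoder so that each value reaches parse as Array[Byte] instead of a String.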

0 answers:

No answers