I have the following code:
val kafkaStream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)
val records = kafkaStream.map(_._2) // keep only the message value (the JSON string)
records.foreachRDD { rdd =>
  // collect() brings the whole batch to the driver, so the loop below runs serially there
  rdd.collect().foreach { a =>
    implicit val formats = DefaultFormats
    import sqlContext.implicits._
    val jValue = parse(a)
    val record = jValue.extract[historyevent]
    println("JSON String" + a)
    val history_gpsdt = record.gpsdt // "2017-04-12 00:25:10"
    val history_latitude = record.latitude // 6.854678
    val history_longitude = record.longitude // 78.583751
    // one Cassandra query is issued per record
    val rss = sc.cassandraTable("db", "table").select("imei", "date", "gpsdt").where("imei=? and date=?", record.imei, record.date)
    // some more set of statements
  }
}
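So here I receive a set of JSON strings from Kafka Streaming, extract each row one at a time, and then run some logic on the data. I would like to know whether this is the best approach or whether it can be optimized further. Please advise. Thanks.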
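
For reference, one direction I have been wondering about is keeping the work on the executors instead of calling collect(), and replacing the per-record sc.cassandraTable query with the connector's joinWithCassandraTable. The following is only a rough sketch and I have not measured it. It assumes that imei and date together form the partition key of db.table, that the fields of the event class look like the hypothetical HistoryEvent below (my real historyevent class is not shown above), and that the remaining per-record logic can run inside foreachPartition. The process helper is also just an illustrative name.

```scala
import com.datastax.spark.connector._
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream
import org.json4s._
import org.json4s.jackson.JsonMethods._

// Assumed shape of the event; field names and types are guesses for illustration.
case class HistoryEvent(imei: String, date: String, gpsdt: String,
                        latitude: Double, longitude: Double)

// Hypothetical helper: `records` is the DStream[String] of JSON values from the code above.
def process(records: DStream[String]): Unit =
  records.foreachRDD { rdd =>
    // Parse on the executors, creating the json4s formats once per partition,
    // instead of collecting the batch to the driver.
    val events: RDD[HistoryEvent] = rdd.mapPartitions { lines =>
      implicit val formats: Formats = DefaultFormats
      lines.map(line => parse(line).extract[HistoryEvent])
    }

    // Look up the matching Cassandra rows for every event in the batch,
    // joining on the columns named in `on` (assumed to be the partition key).
    val joined = events
      .joinWithCassandraTable("db", "table")
      .select("imei", "date", "gpsdt")
      .on(SomeColumns("imei", "date"))

    joined.foreachPartition { pairs =>
      pairs.foreach { case (event, row) =>
        // some more set of statements, using `event` and the Cassandra `row`
      }
    }
  }
```

If the later statements really need all of the data on the driver, this would not apply as written, but it would at least avoid one driver-side Cassandra query per message.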