KafkaUtils API | offset management | Spark Streaming

Time: 2016-09-12 13:19:32

Tags: scala apache-spark apache-kafka spark-streaming

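I am trying to manage Kafka offsets myself to get exactly-once semantics.

I am running into a problem when creating a direct stream from an offset map, as shown below: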

val fromOffsets : (TopicAndPartition, Long) = TopicAndPartition(metrics_rs.getString(1), metrics_rs.getInt(2)) -> metrics_rs.getLong(3)

KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder,(String, String)] (ssc,kafkaParams,fromOffsets,messageHandler)

Here,

val messageHandler =
      (mmd: MessageAndMetadata[String, String]) => mmd.message.length

and

metrics_rs = metricsStatement.executeQuery("SELECT part,off from metrics.txn_offsets where topic='" + t + "'")

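I think something is wrong with the way I am declaring this... I would appreciate any help. The compile error says "too many type arguments for createDirectStream".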

1 Answer:

Answer 0: (score: 2)

I see a couple of things you are doing wrong.

You need to pass a Map[TopicAndPartition, Long], but you currently have a Tuple2[TopicAndPartition, Long]. So you need:

val fromOffsets: Map[TopicAndPartition, Long] = 
    Map(TopicAndPartition(metrics_rs.getString(1), 
                          metrics_rs.getInt(2)) -> metrics_rs.getLong(3))
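If the offsets table holds one row per partition, the map will probably need one entry per row rather than a single pair. The following is a minimal sketch of that idea, not the asker's actual code: the `TopicAndPartition` case class is a local stand-in for `kafka.common.TopicAndPartition` so the snippet runs without Kafka on the classpath, and `rows` is hypothetical data standing in for what would be read from `metrics_rs`.

```scala
object OffsetMapSketch {
  // Local stand-in for kafka.common.TopicAndPartition (assumption: same two fields).
  final case class TopicAndPartition(topic: String, partition: Int)

  // Hypothetical rows (topic, partition, offset), as they might be read from the offsets table.
  val rows: Seq[(String, Int, Long)] = Seq(
    ("metrics", 0, 120L),
    ("metrics", 1, 98L)
  )

  // A single `->` expression is a Tuple2, which is what the question passes.
  val single: (TopicAndPartition, Long) = TopicAndPartition("metrics", 0) -> 120L

  // Converting every row to a pair and calling toMap gives the Map shape instead.
  val fromOffsets: Map[TopicAndPartition, Long] =
    rows.map { case (topic, partition, offset) =>
      TopicAndPartition(topic, partition) -> offset
    }.toMap
}
```

With a real `ResultSet`, the same map could be built by looping while `metrics_rs.next()` is true and adding one pair per row.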

You declare that createDirectStream yields tuples of type (String, String), but your messageHandler returns an Int. If you want it to return a tuple holding the key-value pair, you need:

val messageHandler: MessageAndMetadata[String, String] => (String, String) =
  (mmd: MessageAndMetadata[String, String]) => (mmd.key(), mmd.message())
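The relationship between the handler's return type and the last type parameter can be illustrated without Spark. This is only a sketch under stated assumptions: `MessageAndMetadata` here is a simplified local stand-in for `kafka.message.MessageAndMetadata`, and `applyHandler` is a hypothetical helper that merely mirrors how a result type `R` is tied to the handler; it is not part of KafkaUtils.

```scala
object MessageHandlerSketch {
  // Simplified stand-in for kafka.message.MessageAndMetadata (assumption: only key and message).
  final case class MessageAndMetadata[K, V](key: K, message: V)

  // Hypothetical helper: the element type R of the result is whatever the handler returns.
  def applyHandler[R](records: Seq[MessageAndMetadata[String, String]])(
      handler: MessageAndMetadata[String, String] => R): Seq[R] =
    records.map(handler)

  val records: Seq[MessageAndMetadata[String, String]] =
    Seq(MessageAndMetadata("k1", "hello"), MessageAndMetadata("k2", "hi"))

  // A handler like the one in the question produces Int values.
  val lengths: Seq[Int] = applyHandler(records)(mmd => mmd.message.length)

  // A handler returning (key, message) produces (String, String) values.
  val pairs: Seq[(String, String)] = applyHandler(records)(mmd => (mmd.key, mmd.message))
}
```

Annotating `lengths` as `Seq[(String, String)]` would be rejected by the compiler, which is the same kind of mismatch as declaring (String, String) while the handler returns an Int.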

With both fixes in place, this should compile:

val stream = KafkaUtils
              .createDirectStream[String, String,
                      StringDecoder, StringDecoder,
                      (String, String)] (ssc, 
                                         kafkaParams, 
                                         fromOffsets, 
                                         messageHandler)