在Clojure中使用Kafka迭代的正确方法是什么?

时间:2014-05-15 02:33:11

标签: clojure apache-kafka

以下代码阻止执行。由于某种原因,地图似乎无法返回延迟序列。我不知道为什么。

(ns testt.consumer
  (:require
    ;internal
    ;external
    [clojure.walk   :refer [stringify-keys]]
    [clojure.pprint :as pprint])
  (:import
    [kafka.consumer         ConsumerConfig Consumer KafkaStream ]
    [kafka.javaapi.consumer ConsumerConnector                   ]
    [kafka.message          MessageAndMetadata                  ]
    [java.util              Properties                          ])
  (:gen-class))

; internal

(defn hashmap-to-properties
  [h]
  (doto (Properties.)
    (.putAll (stringify-keys h))))

; external

(defn consumer-connector
  [h]
  (let [config (ConsumerConfig. (hashmap-to-properties h))]
    (Consumer/createJavaConsumerConnector config)))

(defn message-stream
  [^ConsumerConnector consumer topic thread-pool-size]
  ;this is dealing only with the first stream, needs to be fixed to support multiple streams
  (let [ stream (first (.get (.createMessageStreams consumer {topic thread-pool-size}) topic)) ]
    (map #(.message %) (iterate (.next (.iterator ^KafkaStream stream))))))

连接器期望的配置:

{  :zookeeper.connect              "10.0.0.1:2181"
   :group.id                       "test-0"
   :thread.pool.size               "1"
   :topic                          "test_topic"
   :zookeeper.session.timeout.ms   "1000"
   :zookeeper.sync.time.ms         "200"
   :auto.commit.interval.ms        "1000"
   :auto.offset.reset              "smallest"
   :auto.commit.enable             "true" }

1 个答案:

答案 0 :(得分:2)

实际上不需要Kafka的迭代,因为Clojure提供了一种更好的处理流的方法。您可以将Kafka流视为序列,并执行以下操作:

(doseq 
 [^kafka.message.MessageAndMetadata message stream] (do-some-stuff message))

这可能是最有效的方法。

更多代码在这里:

https://github.com/l1x/shovel