I'm trying to figure out the best way to use agents to consume items from a message queue (Amazon SQS). Right now I have a function (process-queue-item) that grabs an item from the queue and processes it.

I want to process these items concurrently, but I can't wrap my head around how to control the agents. Basically I want to keep all of the agents as busy as possible without pulling too many items from the queue and building up a backlog (I'll be running this on a couple of machines, so items need to stay in the queue until they are really needed).

Can anyone give me some pointers on improving my implementation?
(def active-agents (ref 0))

(defn process-queue-item [_]
  (dosync (alter active-agents inc))
  ;retrieve item from Message Queue (Amazon SQS) and process
  (dosync (alter active-agents dec)))

(defn -main []
  (def agents (for [x (range 20)] (agent x)))
  (loop [loop-count 0]
    (if (< @active-agents 20)
      (doseq [agent agents]
        (if (agent-errors agent)
          (clear-agent-errors agent))
        ;should skip this agent until later if it is still busy processing (not sure how)
        (send-off agent process-queue-item)))
    ;(apply await-for (* 10 1000) agents)
    (Thread/sleep 10000)
    (logging/info (str "ACTIVE AGENTS " @active-agents))
    (if (> loop-count 10)
      (do (logging/info (str "done, let's cleanup " loop-count))
          (doseq [agent agents]
            (if (agent-errors agent)
              (clear-agent-errors agent)))
          (apply await agents)
          (shutdown-agents))
      (recur (inc loop-count)))))
Answer 0 (score: 23)
(let [switch (atom true) ; a switch to stop the workers
      workers (doall
                (repeatedly 20 ; 20 workers pulling and processing items from SQS
                  #(future
                     (while @switch
                       ;; retrieve an item from Amazon SQS and process it, e.g.
                       (process-queue-item nil)))))]
  (Thread/sleep 100000) ; arbitrary rule to decide when to stop ;-)
  (reset! switch false) ; stop!
  (doseq [worker workers] @worker)) ; wait for all the workers to be done
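A self-contained version of this pattern can be run locally by standing in for SQS with an in-memory java.util.concurrent.LinkedBlockingQueue; the queue contents, the worker count of 4, and the process-item function below are all placeholders for illustration:

```clojure
(import '(java.util.concurrent LinkedBlockingQueue TimeUnit))

;; stand-in for SQS: an in-memory queue pre-loaded with 100 items
(def queue (LinkedBlockingQueue. (vec (range 100))))
(def processed (atom 0))

(defn process-item [item]
  ;; real code would do the SQS work here; we just count items
  (swap! processed inc))

(let [switch (atom true)
      workers (doall
                (repeatedly 4
                  #(future
                     (while @switch
                       ;; poll with a timeout so workers notice the switch
                       (when-let [item (.poll queue 100 TimeUnit/MILLISECONDS)]
                         (process-item item))))))]
  (Thread/sleep 1000)
  (reset! switch false)
  (doseq [worker workers] @worker))

(println @processed) ; 100 - every item consumed exactly once
(shutdown-agents)    ; let the JVM exit promptly
```

The timed .poll matters: a plain blocking .take would leave a worker stuck waiting on an empty queue and unable to notice that the switch was flipped.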
Answer 1 (score: 6)
What you're asking for is a way to keep handing out tasks, but with some upper limit. One simple way to do this is to use a semaphore to coordinate the limit. Here is how I would approach it:
(let [limit (.availableProcessors (Runtime/getRuntime))
      ;; note: you might choose limit 20 based upon your problem description
      sem (java.util.concurrent.Semaphore. limit)]
  (defn submit-future-call
    "Takes a function of no args and yields a future object that will
    invoke the function in another thread, and will cache the result and
    return it on all subsequent calls to deref/@. If the computation has
    not yet finished, calls to deref/@ will block.
    If n futures have already been submitted, then submit-future blocks
    until the completion of another future, where n is the number of
    available processors."
    [#^Callable task]
    ;; take a slot (or block until a slot is free)
    (.acquire sem)
    (try
      ;; create a future that will free a slot on completion
      (future (try (task) (finally (.release sem))))
      (catch java.util.concurrent.RejectedExecutionException e
        ;; no task was actually submitted
        (.release sem)
        (throw e)))))
(defmacro submit-future
  "Takes a body of expressions and yields a future object that will
  invoke the body in another thread, and will cache the result and
  return it on all subsequent calls to deref/@. If the computation has
  not yet finished, calls to deref/@ will block.
  If n futures have already been submitted, then submit-future blocks
  until the completion of another future, where n is the number of
  available processors."
  [& body] `(submit-future-call (fn [] ~@body)))
#_(example
   user=> (submit-future (reduce + (range 100000000)))
   #<core$future_call$reify__5782@6c69d02b: :pending>
   user=> (submit-future (reduce + (range 100000000)))
   #<core$future_call$reify__5782@38827968: :pending>
   user=> (submit-future (reduce + (range 100000000)))
   ;; blocks at this point for a 2 processor PC until the previous
   ;; two futures complete
   #<core$future_call$reify__5782@214c4ac9: :pending>
   ;; then submits the job)
Now it's just up to you to coordinate how the tasks themselves get processed. It sounds like you already have the mechanics for that in place; then simply loop on (submit-future (process-queue-item)).
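If you want to convince yourself that the semaphore really caps the concurrency, here is a self-contained sketch; the limit of 2, the sleep-based work function, and the running/peak counters are stand-ins for illustration, and submit-future-call is restated without its docstring:

```clojure
;; the throttled submit from above, restated with a fixed limit of 2
(def sem (java.util.concurrent.Semaphore. 2))

(defn submit-future-call [task]
  (.acquire sem) ; blocks until a slot is free
  (try
    (future (try (task) (finally (.release sem))))
    (catch java.util.concurrent.RejectedExecutionException e
      (.release sem)
      (throw e))))

(def running (atom 0)) ; how many tasks are in flight right now
(def peak (atom 0))    ; the highest concurrency ever observed

(defn work []
  (swap! running inc)
  (swap! peak max @running)
  (Thread/sleep 50)
  (swap! running dec))

(dotimes [_ 10] (submit-future-call work)) ; at most 2 ever run at once
(.acquire sem 2) ; both permits free again => all tasks have finished

(println @peak) ; never exceeds 2
(shutdown-agents)
```

Because .acquire runs in the submitting thread, it is the submission itself that blocks, which matches the goal in the question: items are only pulled when a worker slot is actually free.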
Answer 2 (score: 4)
Perhaps you could use the seque function? Quoting (doc seque):
clojure.core/seque
([s] [n-or-q s])
Creates a queued seq on another (presumably lazy) seq s. The queued
seq will produce a concrete seq in the background, and can get up to
n items ahead of the consumer. n-or-q can be an integer n buffer
size, or an instance of java.util.concurrent BlockingQueue. Note
that reading from a seque can block if the reader gets ahead of the
producer.
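A quick way to get a feel for seque; the side-effecting map below is just a stand-in for "items fetched over the network":

```clojure
(def fetched (atom 0))

;; a lazy seq whose realization has a visible side effect,
;; standing in for items fetched from a remote queue
(def items (map (fn [i] (swap! fetched inc) i) (range 100)))

;; seque realizes up to 5 items ahead of the consumer, in the background
(def buffered (seque 5 items))

(println (take 3 buffered)) ; (0 1 2) - looks like an ordinary seq
(println (count buffered))  ; 100 - forcing the rest drains the producer
(shutdown-agents)
```

From the consumer's side, buffered is indistinguishable from items; the read-ahead happens on a background thread.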
What I'm thinking of is a lazy sequence of queue items fetched over the network; you'd wrap this in seque, put that in a Ref, and have worker agents consume items off of this seque. seque returns something that looks like a regular seq from your code's point of view, with the queue magic happening transparently. Note that if the sequence you put inside is chunked, it will still be forced a chunk at a time. Also note that the initial call to seque itself seems to block until an initial item or two is obtained (or one chunk, as appropriate; I think this has more to do with the way lazy sequences work than with seque itself, though).
Code sketch (a really rough one, untested):

(defn get-queue-items-seq []
  (lazy-seq
   (cons (get-queue-item)
         (get-queue-items-seq))))

(def task-source (ref (seque (get-queue-items-seq))))

(defn do-stuff []
  (let [worker (agent nil)]
    (if-let [result
             (dosync
              (when-let [task (first @task-source)]
                (alter task-source rest) ; advance past the task we just took
                (send worker (fn [_] (do-stuff-with task)))))]
      (do (await worker)
          ;; maybe do something with worker's state
          (do-stuff))))) ;; continue working

(defn do-lots-of-stuff []
  (let [fs (doall (repeatedly 20 #(future (do-stuff))))]
    fs))

Actually you'd probably want a more sophisticated producer of the queue items seq, so that you could ask it to stop producing new items; that's necessary if the whole thing is to be able to shut down gracefully (the futures will die when the task source runs dry; use future-done? to see whether they have). And that's just what I can see at first glance... I'm sure there is more to polish here. I think the general approach would work, though.
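A "producer you can ask to stop" might look like this sketch, with an atom as the off switch; get-queue-item here is a local counter standing in for the real network fetch:

```clojure
(def stop? (atom false))
(def counter (atom 0))

(defn get-queue-item []
  ;; stand-in for fetching one item over the network
  (swap! counter inc))

(defn get-queue-items-seq []
  ;; ends the (otherwise infinite) seq once asked to stop
  (lazy-seq
    (when-not @stop?
      (cons (get-queue-item)
            (get-queue-items-seq)))))

(def f (future (doall (take 5 (get-queue-items-seq)))))

(println @f)                    ; (1 2 3 4 5)
(reset! stop? true)             ; flip the switch
(println (future-done? f))      ; true - the consumer has finished
(println (get-queue-items-seq)) ; () - the producer now yields nothing
(shutdown-agents)
```

Once the switch is flipped, any consumer walking the seq simply runs out of items and terminates on its own, which is the graceful-shutdown behavior described above.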
Answer 3 (score: 0)
I'm not sure how idiomatic this is, as I'm still new to the language, but the following solution works for me:
(let [number-of-messages-per-time 2
      await-timeout 1000]
  (doseq [p-messages (partition number-of-messages-per-time messages)]
    (let [agents (map agent p-messages)]
      (doseq [a agents] (send-off a process))
      (apply await-for await-timeout agents)
      ;; doall forces the lazy map so the derefs actually run
      (doall (map deref agents)))))
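Run against a stand-in list of messages and a trivial process function (both placeholders for the real SQS items and handler), the batching pattern behaves like this:

```clojure
(def results (atom []))

(defn process [message]
  ;; stand-in processing: just record the message
  (swap! results conj message))

(let [messages (range 10)
      number-of-messages-per-time 2
      await-timeout 1000]
  (doseq [p-messages (partition number-of-messages-per-time messages)]
    (let [agents (map agent p-messages)]
      (doseq [a agents] (send-off a process))
      (apply await-for await-timeout agents))))

(println (sort @results)) ; (0 1 2 3 4 5 6 7 8 9) - each batch awaited in turn
(shutdown-agents)
```

Note that this caps the backlog at one batch per machine, since await-for blocks before the next batch of agents is created.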