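In a Ruby application, I have a bunch of tasks that share no state, and I want to start several of them at once. Crucially, I don't care about the order in which they start, nor about their return values (each one results in a database transaction before it completes). I know that, depending on my Ruby implementation, the GIL may prevent these tasks from actually running at the same time, but that's fine, because I'm not really interested in true parallelism: these worker threads will be IO-bound on network requests anyway.
What I've got so far is this: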
def asyncDispatcher(numConcurrent, stateQueue, &workerBlock)
  workerThreads = []
  while not stateQueue.empty?
    while workerThreads.length < numConcurrent
      nextState = stateQueue.pop
      nextWorker =
        Thread.new(nextState) do |st|
          workerBlock.call(st)
        end
      workerThreads.push(nextWorker)
    end # inner while
    workerThreads.delete_if{|th| not th.alive?} # clean up dead threads
  end # outer while
  workerThreads.each{|th| th.join} # join any remaining workers
end # asyncDispatcher
And I call it like this:
asyncDispatcher(2, (1..10).to_a ) {|x| x + 1}
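Are there any lurking bugs or concurrency pitfalls here? Or is there something in the runtime that would simplify this task?
Answer 0 (score: 2)
Use a Queue: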
require 'thread'

def asyncDispatcher(numWorkers, stateArray, &processor)
  q = Queue.new
  threads = []

  (1..numWorkers).each do |worker_id|
    # Pass worker_id in as a thread argument so each thread gets its own copy
    threads << Thread.new(worker_id) do |id|
      loop do
        next_state = q.shift   # shift() blocks if q is empty, which is the case now
        break if next_state.equal?(q)   # Some sentinel that won't appear in your data
        processor.call(next_state, id)
      end
    end
  end

  stateArray.each {|state| q.push state}
  numWorkers.times { q.push q }   # One sentinel per worker, so every thread exits
  threads.each(&:join)
end

asyncDispatcher(2, (1..10).to_a) do |state, worker_id|
  time = sleep(Random.rand 10)   # How long it took to process state (sleep returns whole seconds slept)
  puts "#{state} is finished being processed: worker ##{worker_id} took #{time} secs."
end
--output:--
2 is finished being processed: worker #1 took 4 secs.
3 is finished being processed: worker #1 took 1 secs.
1 is finished being processed: worker #2 took 7 secs.
5 is finished being processed: worker #2 took 1 secs.
6 is finished being processed: worker #2 took 4 secs.
7 is finished being processed: worker #2 took 1 secs.
4 is finished being processed: worker #1 took 8 secs.
8 is finished being processed: worker #2 took 1 secs.
10 is finished being processed: worker #2 took 3 secs.
9 is finished being processed: worker #1 took 9 secs.
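Okay, okay, someone will look at that output and cry out,
"Hey, #2 took a total of 13 secs to complete four jobs in a row, while #1 took only 8 secs for a single job, so the output for #1's 8 sec job should have come sooner. There's no thread switching in Ruby! Ruby is broken!"
Well, while #1 was sleeping through its first two jobs for a total of 5 secs, #2 was sleeping at the same time, so when #1 finished its first two jobs, #2 had only 2 secs left to sleep on its first job. So replace #2's 7 secs with 2 secs, and you will see that after #1 finished its first two jobs, #2's run of four jobs in a row took a total of 8 secs, which ties #1's single 8 sec job.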
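The arithmetic can be checked with a small sketch that replays the durations from the output as a timeline. The names `WORKER_JOBS` and `finish_timeline` are made up for illustration, and the sketch assumes both workers start at t = 0 and pick up their next job with no gap in between:

```ruby
# Durations copied from the output above, in the order each worker ran its jobs.
# Each entry is [state, seconds]. WORKER_JOBS is a hypothetical name.
WORKER_JOBS = {
  1 => [[2, 4], [3, 1], [4, 8], [9, 9]],
  2 => [[1, 7], [5, 1], [6, 4], [7, 1], [8, 1], [10, 3]]
}.freeze

# Returns [[finish_time, worker_id, state], ...] sorted by finish time.
# Assumption: every worker starts at t = 0 and never idles between jobs.
def finish_timeline(worker_jobs)
  events = worker_jobs.flat_map do |worker_id, jobs|
    clock = 0
    jobs.map do |state, secs|
      clock += secs               # this worker's running total of sleep time
      [clock, worker_id, state]
    end
  end
  events.sort
end

finish_timeline(WORKER_JOBS).each do |finished_at, worker_id, state|
  puts "t=#{finished_at}s: worker ##{worker_id} finished state #{state}"
end
```

Under those assumptions, state 4 (worker #1) and state 7 (worker #2) both finish at t = 13, so which of the two lines prints first is a tie, not evidence that the threads failed to overlap.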