IO-bound threads in Ruby

Time: 2013-07-06 02:18:16

Tags: ruby concurrency worker

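In a Ruby application, I have a bunch of tasks that share no state, and I'd like to start several of them at a time. Crucially, I don't care about the order in which they start, nor about their return values (since each of them produces a database transaction before it completes). I know that, depending on my Ruby implementation, the GIL may prevent these tasks from actually running at the same time, but that's OK because I'm not actually interested in true concurrency: these worker threads will be IO-bound on network requests anyway.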
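As a rough illustration of that last point, here is a minimal sketch in which `sleep` stands in for a blocking network request (the helper names `fake_request` and `timed_run` are hypothetical, not part of my application). On MRI, a thread blocked in `sleep` or waiting on IO releases the GIL, so the waits should overlap rather than add up:

```ruby
# Hypothetical stand-in for an IO-bound task: sleep blocks the calling
# thread much like waiting on a network response would.
def fake_request(seconds)
  sleep(seconds)
end

# Starts task_count threads that each "wait" for the given time and
# returns the wall-clock seconds until all of them have finished.
def timed_run(task_count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(task_count) { Thread.new { fake_request(seconds) } }
  threads.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = timed_run(4, 0.2)
puts format("4 tasks of 0.2s each finished in %.2fs", elapsed)
```

If the waits overlap, the elapsed time should be close to 0.2 seconds instead of 0.8.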

This is what I have so far:

def asyncDispatcher(numConcurrent, stateQueue, &workerBlock)
  workerThreads = []

  while not stateQueue.empty?
    while workerThreads.length < numConcurrent
      nextState = stateQueue.pop

      nextWorker =
        Thread.new(nextState) do |st|
          workerBlock.call(st)
        end

      workerThreads.push(nextWorker)
    end # inner while

    workerThreads.delete_if{|th| not th.alive?} # clean up dead threads
  end # outer while

  workerThreads.each{|th| th.join} # join any remaining workers
end # asyncDispatcher

I call it like this:

asyncDispatcher(2, (1..10).to_a ) {|x| x + 1}

Are there any lurking bugs or concurrency pitfalls here? Or is there something in the runtime that would simplify this task?

1 answer:

Answer 0 (score: 2)

Use a queue:

require 'thread'

def asyncDispatcher(numWorkers, stateArray, &processor)
  q = Queue.new
  threads = []

  (1..numWorkers).each do |worker_id|
    threads << Thread.new(processor, worker_id) do |processor, worker_id|
      while true
        next_state = q.shift      #shift() blocks if q is empty, which is the case now
        break if next_state == q  #Some sentinel that won't appear in your data
        processor.call(next_state, worker_id)
      end
    end
  end

  stateArray.each {|state| q.push state}
  numWorkers.times {q.push q}    #One sentinel per worker, so every thread sees one and exits

  threads.each(&:join)
end


asyncDispatcher(2, (1..10).to_a) do |state, worker_id|
  time = sleep(Random.rand 10)  #How long it took to process state
  puts "#{state} is finished being processed: worker ##{worker_id} took #{time} secs."
end

--output:--
2 is finished being processed: worker #1 took 4 secs.
3 is finished being processed: worker #1 took 1 secs.
1 is finished being processed: worker #2 took 7 secs.
5 is finished being processed: worker #2 took 1 secs.
6 is finished being processed: worker #2 took 4 secs.
7 is finished being processed: worker #2 took 1 secs.
4 is finished being processed: worker #1 took 8 secs.
8 is finished being processed: worker #2 took 1 secs.
10 is finished being processed: worker #2 took 3 secs.
9 is finished being processed: worker #1 took 9 secs.

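Okay, okay, someone is going to look at that output and cry out,

  "Hey, #2 took a total of 13 seconds to finish four jobs in a row, while #1 only took 8 secs. for one job, so the output for #1's 8 sec. job should have come sooner. There's no thread switching in Ruby! Ruby is broken!"

Well, while #1 was sleeping through its first two jobs for a total of 5 seconds, #2 was sleeping at the same time, so when #1 finished its first two jobs, #2 had only 2 more seconds left to sleep on its 7-second job. So replace #2's 7 seconds with 2 seconds, and you will see that after #1 finished its first two jobs, #2's run of four jobs in a row took a total of 8 seconds (2 + 1 + 4 + 1), which ties with #1's 8-second job.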
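The same arithmetic can be checked mechanically. The following is only a sketch: the per-worker job lists are copied by hand from the sample output above, and it assumes each worker runs its jobs back to back with no scheduling overhead, so a job's finish time is just the running total of that worker's sleep durations. The names `jobs` and `finish_times` are made up for this illustration.

```ruby
# worker_id => list of [state, seconds], in the order each worker
# reported them in the sample output.
jobs = {
  1 => [[2, 4], [3, 1], [4, 8], [9, 9]],
  2 => [[1, 7], [5, 1], [6, 4], [7, 1], [8, 1], [10, 3]]
}

# state => [finish time in seconds since start, worker_id]
finish_times = {}
jobs.each do |worker_id, list|
  clock = 0
  list.each do |state, secs|
    clock += secs
    finish_times[state] = [clock, worker_id]
  end
end

# Print in order of completion; ties are broken by state number here,
# whereas in a real run the order of tied jobs is up to the scheduler.
finish_times.sort_by { |state, (clock, _worker)| [clock, state] }
            .each do |state, (clock, worker_id)|
  puts "t=#{clock}s: state #{state} done (worker ##{worker_id})"
end
```

Under these assumptions, state 7 (worker #2) and state 4 (worker #1) both finish at the 13-second mark, which is why their lines appear next to each other in the output.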