Ruby - Iteration

Time: 2016-06-04 00:22:31

Tags: ruby producer-consumer

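Note: I chose to use threads to resolve DNS names, but the same behavior can be reproduced with any similar operation.

When I tried to move my (previously working) code from standard single-threaded execution to multithreading, I got unexpected results. Specifically, my code iterates over an array of hashes and adds a key/value pair to each hash in the array.

The problem seems to come from the dns_cname.map loop that creates the new key/value pair. Instead of the "external_dns_entry" key holding the correct value (i.e. result.name.to_s, the name resolved by DNS), it ends up with the name belonging to one of the many other servers in url_nameserver_mapping.

I have a feeling that the DNS resolution happens whenever a thread becomes available and the hashes are being updated out of order, but I don't even know how to begin tracking down a problem like this.

The problematic result: the DNS resolution run against server1 is mapped to server17. Likewise, server17 is mapped to server99, and so on. The rest of each hash remains intact.

Any help is greatly appreciated. Thanks in advance!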

Here is the code without multithreading enabled (works correctly):

require 'json'
require 'resolv'

url_nameserver_mapping = { "server1" => "dallasdns.dns.com",
                           "server2" => "portlanddns.dns.com",
                           "server3" => "losangelesdns.dns.com" }


# Parse the JSON string response from the API into a valid Ruby Hash
# The net/http GET request is not shown here for brevity but it was stored in 'response'
unsorted_urls = JSON.parse(response.body)

# Sort (not sure this is relevant)
# I left it since my data is being populated to the Hash incorrectly (w/ threading enabled)
url_properties = unsorted_urls['hostnames']['items'].sort_by { |k| k["server"]}

url_nameserver_mapping.each do |server, location|
  dns = Resolv::DNS.new(:nameserver => ['8.8.8.8'])
  dns_cname = dns.getresources(server, Resolv::DNS::Resource::IN::CNAME)

  dns_cname.map do |result|
    # Create a new key/value for each Hash in url_properties Array
    # Occurs if the server compared matches the value of url['server'] key
    url_properties.each do |url|
      url["external_dns_entry"] = result.name.to_s if url['server'] == server
    end
  end
end

I followed the guide at https://blog.engineyard.cm/2013/ruby-concurrency to implement a producer/consumer threading model.

Here is my adapted code with multithreading enabled (does not work):

require 'json'
require 'resolv'
require 'thread'
require 'monitor'

thread_count = 8
threads = Array.new(thread_count)
producer_queue = SizedQueue.new(thread_count)
threads.extend(MonitorMixin)
threads_available = threads.new_cond
sysexit = false

url_nameserver_mapping = { "server1" => "dallasdns.dns.com",
                           "server2" => "portlanddns.dns.com",
                           "server3" => "losangelesdns.dns.com" }


unsorted_urls = JSON.parse(response.body)

url_properties = unsorted_urls['hostnames']['items'].sort_by { |k| k["server"]}

####################
##### Consumer #####
####################

consumer_thread = Thread.new do

  loop do

    break if sysexit && producer_queue.length == 0
    found_index = nil

    threads.synchronize do
      threads_available.wait_while do
        threads.select { |thread| thread.nil? ||
                                  thread.status == false ||
                                  thread["finished"].nil? == false}.length == 0
      end
      # Get the index of the available thread
      found_index = threads.rindex { |thread| thread.nil? ||
                                              thread.status == false ||
                                              thread["finished"].nil? == false }
    end

    @domain = producer_queue.pop

      threads[found_index] = Thread.new(@domain) do

        dns = Resolv::DNS.new(:nameserver => ['8.8.8.8'])
        dns_cname = dns.getresources(@domain, Resolv::DNS::Resource::IN::CNAME)

        dns_cname.map do |result|
           url_properties.each do |url|
             url["external_dns_entry"] = result.name.to_s if url['server'] == @domain
           end
        end

        Thread.current["finished"] = true

        # Notify the consumer that another batch of work has been completed
        threads.synchronize { threads_available.signal }
      end
  end
end

####################
##### Producer #####
####################

producer_thread = Thread.new do

  url_nameserver_mapping.each do |server,location|

    producer_queue << server

    threads.synchronize do
      threads_available.signal
    end
  end
  sysexit = true
end

# Join on both the producer and consumer threads so the main thread doesn't exit
producer_thread.join
consumer_thread.join

# Join on the worker threads to allow them to finish
threads.each do |thread|
  thread.join unless thread.nil?
end

1 Answer:

Answer 0: (score: 0)

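@domain is shared by all threads, and that sharing is the root of your problem: when it is updated by popping the next unit of work off the queue, every thread sees the change. You can avoid this by passing the value into the thread as a block argument:
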
Thread.new(producer_queue.pop) do |domain|
  # domain isn't shared with anyone (as long as there
  # is no local variable called domain in the enclosing scope)
end
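
To make the difference concrete, here is a minimal sketch that contrasts the two patterns. The method names and the `start` gate queue are hypothetical; the gate exists only to force the unlucky ordering every time, whereas in the question's code the same interleaving happens nondeterministically. No DNS lookups are involved.

```ruby
require 'thread'

# Pattern from the question: each block reads @domain when it runs,
# not when the thread was created.
def run_with_shared_ivar(work)
  seen  = Queue.new
  start = Queue.new            # hypothetical gate, only here to force the ordering
  threads = work.map do |name|
    @domain = name
    Thread.new do
      start.pop                # wait until every assignment to @domain is done
      seen << @domain          # reads whatever @domain holds at this moment
    end
  end
  work.size.times { start << :go }
  threads.each(&:join)
  Array.new(work.size) { seen.pop }.sort
end

# Pattern from the answer: the value is handed to the thread as a block argument.
def run_with_block_argument(work)
  seen  = Queue.new
  start = Queue.new
  threads = work.map do |name|
    Thread.new(name) do |domain|
      start.pop
      seen << domain           # private copy of the reference, unaffected by later work
    end
  end
  work.size.times { start << :go }
  threads.each(&:join)
  Array.new(work.size) { seen.pop }.sort
end

work = %w[server1 server2 server3]
p run_with_shared_ivar(work)     # => ["server3", "server3", "server3"]
p run_with_block_argument(work)  # => ["server1", "server2", "server3"]
```

In the first method every thread ends up working on the last value assigned, which is the same kind of mismatch as server1 being resolved under server17's name.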

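Not directly related to your problem, but this seems like a really over-engineered approach. It is a lot easier to start a bunch of consumer threads ahead of time and have them read directly from the queue of work.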
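
A rough sketch of that simpler layout follows, under a few assumptions: `resolve_all` and `fake_cname` are hypothetical names, `fake_cname` stands in for the `Resolv::DNS#getresources` call so the example runs without network access, and the `:done` marker is just one possible way to tell the workers to stop.

```ruby
require 'thread'

# Hypothetical stand-in for the DNS lookup; the real code would call
# Resolv::DNS#getresources here and use result.name.to_s.
def fake_cname(server)
  "#{server}.cdn.example.net"
end

# Starts a fixed pool of workers that pull server names straight off the queue.
def resolve_all(servers, url_properties, worker_count: 4, &lookup)
  queue = Queue.new
  servers.each { |s| queue << s }
  worker_count.times { queue << :done }    # one stop marker per worker

  results = {}
  results_lock = Mutex.new

  workers = Array.new(worker_count) do
    Thread.new do
      # `server` is local to this block, so no other thread can overwrite it
      while (server = queue.pop) != :done
        name = lookup.call(server)
        results_lock.synchronize { results[server] = name }
      end
    end
  end
  workers.each(&:join)

  # Update the hashes once, from a single thread, after all lookups finish
  url_properties.each do |url|
    key = url["server"]
    url["external_dns_entry"] = results[key] if results.key?(key)
  end
end

url_properties = [{ "server" => "server1" }, { "server" => "server2" }]
resolve_all(%w[server1 server2 server3], url_properties, worker_count: 2) do |server|
  fake_cname(server)
end
p url_properties.map { |url| url["external_dns_entry"] }
# => ["server1.cdn.example.net", "server2.cdn.example.net"]
```

Collecting the results in a mutex-protected hash and writing to url_properties afterwards is a design choice made here to keep the shared array out of the worker threads; it is not something the answer prescribes.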