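Note: I chose to use threads to resolve DNS names, but the same behavior can be reproduced with any similar operation.
When I tried to move my (previously working) code from standard single-threaded execution to multithreading, I got unexpected results. Specifically, my code iterates over an array of hashes and adds a key/value pair to each hash in the array.
The problem appears to come from the dns_cname.map loop, where the new key/value pair is created. Instead of the "external_dns_entry" key getting the correct value (i.e. result.name.to_s, which contains the name resolved by DNS), it gets the name of one of the many other servers in url_nameserver_mapping.
I have a feeling that the DNS resolution happens as threads become available and the hashes are updated out of order, but I don't even know how to begin tracking down a problem like this.
Problematic result: the DNS resolution run against server1 gets mapped to server17. Similarly, server17 gets mapped to server99, and so on. The rest of the hash remains intact.
Any help is much appreciated. Thanks in advance!
Here is my code without multithreading enabled (works correctly):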
url_nameserver_mapping = { "server1" => "dallasdns.dns.com",
                           "server2" => "portlanddns.dns.com",
                           "server3" => "losangelesdns.dns.com" }
# Parse the JSON string response from the API into a valid Ruby Hash
# The net/http GET request is not shown here for brevity but it was stored in 'response'
unsorted_urls = JSON.parse(response.body)
# Sort (not sure this is relevant)
# I left it since my data is being populated to the Hash incorrectly (w/ threading enabled)
url_properties = unsorted_urls['hostnames']['items'].sort_by { |k| k["server"]}
url_nameserver_mapping.each do |server,location|
  dns = Resolv::DNS.new(:nameserver => ['8.8.8.8'])
  dns_cname = dns.getresources(server, Resolv::DNS::Resource::IN::CNAME)
  dns_cname.map do |result|
    # Create a new key/value for each Hash in url_properties Array
    # Occurs if the server compared matches the value of url['server'] key
    url_properties.each do |url|
      url["external_dns_entry"] = result.name.to_s if url['server'] == server
    end
  end
end
I followed the guide at https://blog.engineyard.cm/2013/ruby-concurrency to implement a producer/consumer threading model.
Here is my adapted code with multithreading enabled (not working):
require 'thread'
require 'monitor'
thread_count = 8
threads = Array.new(thread_count)
producer_queue = SizedQueue.new(thread_count)
threads.extend(MonitorMixin)
threads_available = threads.new_cond
sysexit = false
url_nameserver_mapping = { "server1" => "dallasdns.dns.com",
                           "server2" => "portlanddns.dns.com",
                           "server3" => "losangelesdns.dns.com" }
unsorted_urls = JSON.parse(response.body)
url_properties = unsorted_urls['hostnames']['items'].sort_by { |k| k["server"]}
####################
##### Consumer #####
####################
consumer_thread = Thread.new do
  loop do
    break if sysexit && producer_queue.length == 0
    found_index = nil
    threads.synchronize do
      threads_available.wait_while do
        threads.select { |thread| thread.nil? ||
                                  thread.status == false ||
                                  thread["finished"].nil? == false}.length == 0
      end
      # Get the index of the available thread
      found_index = threads.rindex { |thread| thread.nil? ||
                                              thread.status == false ||
                                              thread["finished"].nil? == false }
    end
    @domain = producer_queue.pop
    threads[found_index] = Thread.new(@domain) do
      dns = Resolv::DNS.new(:nameserver => ['8.8.8.8'])
      dns_cname = dns.getresources(@domain, Resolv::DNS::Resource::IN::CNAME)
      dns_cname.map do |result|
        url_properties.each do |url|
          url["external_dns_entry"] = result.name.to_s if url['server'] == @domain
        end
      end
      Thread.current["finished"] = true
      # Notify the consumer that another batch of work has been completed
      threads.synchronize { threads_available.signal }
    end
  end
end
####################
##### Producer #####
####################
producer_thread = Thread.new do
  url_nameserver_mapping.each do |server,location|
    producer_queue << server
    threads.synchronize do
      threads_available.signal
    end
  end
  sysexit = true
end
# Join on both the producer and consumer threads so the main thread doesn't exit
producer_thread.join
consumer_thread.join
# Join on the child processes to allow them to finish
threads.each do |thread|
  thread.join unless thread.nil?
end
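Answer 0 (score: 0)
@domain is an instance variable of the enclosing object, so it is shared by all threads, and this sharing is the root of your problem: when it is updated by popping the next unit of work off the queue, every thread sees that change. Passing @domain to Thread.new does not help, because the block never declares a parameter to receive it and keeps reading @domain directly. You can avoid this by passing the value in as a block argument: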
Thread.new(producer_queue.pop) do |domain|
  # domain isn't shared with anyone (as long as there
  # is no local variable called domain in the enclosing scope)
end
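To make that concrete, here is a minimal sketch of the pattern. It assumes a hypothetical lookup helper in place of the real Resolv::DNS call, so it runs without network access, and the server names are just sample data. Each thread receives its own domain as a block argument and writes its result under a mutex:

```ruby
require 'thread'

# Hypothetical stand-in for the DNS lookup (an assumption for this sketch):
# it simply echoes the name after a short delay.
def lookup(name)
  sleep 0.01
  name
end

servers = %w[server1 server2 server3]
queue = Queue.new
servers.each { |s| queue << s }

results = {}
lock = Mutex.new

threads = servers.map do
  # The popped value is handed to the block as `domain`, so every thread
  # holds its own copy; a later pop cannot change what this thread sees.
  Thread.new(queue.pop) do |domain|
    resolved = lookup(domain)
    lock.synchronize { results[domain] = resolved }
  end
end
threads.each(&:join)
```

With the shared @domain version, a thread that is still waiting on its lookup can end up comparing against whatever was popped most recently, which would explain server1 picking up another server's name.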
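Unrelated to your immediate problem, but this seems like a really over-engineered approach. It would be much easier to start a bunch of consumer threads ahead of time and have them read directly from the work queue.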
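A rough sketch of that simpler design follows; treat it as an illustration rather than a drop-in replacement. resolve_cname is a hypothetical stand-in for dns.getresources, and the url_properties sample data, the worker count, and the :done stop marker are all assumptions made for the example:

```ruby
require 'thread'

# Hypothetical stand-in for the CNAME lookup so the sketch runs offline.
def resolve_cname(server)
  "#{server}.cdn.example.net"
end

# Sample data shaped like the question's url_properties (an assumption).
url_properties = [
  { "server" => "server1" },
  { "server" => "server2" },
  { "server" => "server3" }
]

work = Queue.new
url_properties.each { |url| work << url["server"] }

worker_count = 2
worker_count.times { work << :done } # one stop marker per worker

results = {}
lock = Mutex.new

# Start the workers up front; each one pulls work straight from the queue.
workers = Array.new(worker_count) do
  Thread.new do
    # `server` is local to this block, so workers never share it.
    while (server = work.pop) != :done
      cname = resolve_cname(server)
      lock.synchronize { results[server] = cname }
    end
  end
end
workers.each(&:join)

# Merge the results back once all workers have finished.
url_properties.each do |url|
  name = url["server"]
  url["external_dns_entry"] = results[name] if results.key?(name)
end
```

Queue#pop blocks until an item is available, so there is no need for the monitor, the condition variable, or the bookkeeping over free thread slots.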