Question

Is this a bug in MRI, or is there a good explanation for this behavior?

def pmap(enum)
  return to_enum(:pmap, enum) unless block_given?
  enum.map { |e| Thread.new { yield e } }.map(&:value)
end

# Returns elements in order, as expected.
pmap(1..10) { |e| e } #=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Returns elements in nondeterministic order on MRI >= 1.9.3.
# Works as expected on JRuby, Rubinius, and earlier versions of MRI.
pmap(1..10).to_a      #=> [7, 2, 3, 4, 6, 5, 9, 8, 10, 1]

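The first map should return an array of threads, the first thread yielding 1, and so on.

The second map should collect the value of each thread.

I don't understand why the results come back out of order.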

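I've looked at what I believe is the relevant code in enum.c, but I still don't understand why this happens. I suspect it is a performance optimization gone wrong. Or am I expecting too much from Enumerable#to_a (specifically, that it will not change the order of the enumerable)?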

Answer 1

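Line 3 of pmap maps the enumeration to threads. Thread.new returns immediately, regardless of when the thread's block completes. Only later, well after the threads were created, does .value block until each thread has finished.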

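The example below shows that the thread blocks themselves are not evaluated in order, while the Thread.new calls return quickly enough that the Thread instances end up in order.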

def pmap(enum)
  return to_enum(:pmap, enum) unless block_given?
  enum.map { |e| Thread.new { sleep(Random.rand); p e; yield e } }.map(&:value)
end

pmap(1..10) { |e| e }

# Printed by `p e` in one sample run:
# 1 2 5 6 8 3 7 9 4 10

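Here is how you can put the results of the parallel execution back in order when going through to_enum, which runs the block using a fiber: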

def pmap(enum)
  return to_enum(:pmap, enum) unless block_given?
  enum.each_with_index.map { |e, i|
    Thread.new { sleep(Random.rand); p e; yield({ index: i, value: e }) }
  }.map(&:value)
end

p to_enum(:pmap, 1..10).sort_by { |hash| hash[:index] }.map { |hash| hash[:value] }
#p pmap(1..10) { |x| x }
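The same index-tagging idea can be sketched without to_enum. This is only an illustration, not the answerer's code: `ordered_pmap` and `finished` are hypothetical names, and a Queue is assumed as the place where threads drop their results in completion order.

```ruby
# Hypothetical helper: runs the block in one thread per element and
# restores input order from the index attached to each result.
def ordered_pmap(enum)
  finished = Queue.new # thread-safe; holds results in completion order
  threads = enum.each_with_index.map do |e, i|
    Thread.new { finished << [i, yield(e)] }
  end
  threads.each(&:join)
  pairs = Array.new(finished.size) { finished.pop }
  pairs.sort_by(&:first).map(&:last) # back to input order
end

doubled = ordered_pmap(1..5) do |e|
  sleep(0.01 * (6 - e)) # later elements tend to finish first
  e * 2
end
p doubled # => [2, 4, 6, 8, 10]
```

Because the final array is sorted by index, the result does not depend on which thread happens to finish first.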

Answer 2

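You are yielding from within threads, and since the threads run concurrently, they are quite likely to yield in an arbitrary order. My guess is that the enumerator uses the sequence of yield calls to build its output. 1.9.3 uses OS threads, whereas 1.8.7 has "green" threads, which could explain the difference.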
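A minimal single-threaded sketch of that guess: `scrambled` is a hypothetical method, not from the question, that yields in one order and returns an array in another. It suggests that to_a on the enumerator is built from the yielded values in the order they arrive, and never looks at the method's return value.

```ruby
# Hypothetical helper: yields in a fixed, scrambled order.
def scrambled
  return to_enum(:scrambled) unless block_given?
  yield 3
  yield 1
  yield 2
  [1, 2, 3] # the method's return value; to_a never sees it
end

collected = scrambled.to_a      # built from the yields, in call order
returned  = scrambled { |x| x } # the plain return value of the method

p collected # => [3, 1, 2]
p returned  # => [1, 2, 3]
```

In the question's pmap the yields come from different threads, so under this reading the order seen by to_a is simply the order in which the threads reach their yield.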

Why does Enumerable#to_a behave this way in Ruby versions >= 1.9.3?
