julia-lang: caching data in a parallel thread using @async

Time: 2017-01-30 01:52:11

Tags: parallel-processing async-await julia

Suppose we have a slow function that produces data and another slow function that processes it, as follows:

# some slow function
function prime(i)
  sleep(2)
  println("processed $i")
  i
end

function slow_process(x)
  sleep(2)
  println("slow processed $x")
end

function each(rng)
  function _iter()
    for i ∈ rng
      @time d = prime(i)
      produce(d)
    end
  end
  return Task(_iter)
end

@time for x ∈ each(1000:1002)
  slow_process(x)
end

Output:

% julia test-task.jl
processed 1000
  2.063938 seconds (37.84 k allocations: 1.605 MB)
slow processed 1000
processed 1001
  2.003115 seconds (17 allocations: 800 bytes)
slow processed 1001
processed 1002
  2.001798 seconds (17 allocations: 800 bytes)
slow processed 1002
 12.166475 seconds (88.08 k allocations: 3.640 MB)

Is there a way to use @async to fetch and cache the data in a parallel thread and feed it to the slow_process function?

Edit: I updated the example to clarify the problem. Ideally, the example should take 2 + 6 seconds instead of 12 seconds.

Edit 2: This is my attempt using @sync and @async, but I get the error ERROR (unhandled task failure): no process with id 2 exists

macro swap(x,y)
  quote
    local tmp = $(esc(x))
    $(esc(x)) = $(esc(y))
    $(esc(y)) = tmp
  end
end

# some slow function
function prime(i)
  sleep(2)
  println("processed $i")
  i
end

function slow_process(x)
  sleep(2)
  println("slow processed $x")
end

function each(rng)
  @assert length(rng) > 1
  rng = collect(rng)
  a = b = nothing
  function _iter()
    for i ∈ 1:length(rng)
      if a == nothing
        a = @async remotecall_fetch(prime, 2, rng[i])
        b = @async remotecall_fetch(prime, 2, rng[i+1])
      else
        if i < length(rng)
          a = @async remotecall_fetch(prime, 2, rng[i+1])
        end
        @swap(a,b)
      end
      @sync d = a
      produce(d)
    end
  end
  return Task(_iter)
end

@time for x ∈ each(1000:1002)
  slow_process(x)
end
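
The error message suggests that no worker process with id 2 had been started when the script ran (for example, Julia was launched without -p), and that prime was only defined on the master process. The following is a minimal sketch of that setup, assuming Julia 1.x, where these functions live in the Distributed standard library (in the Julia version used above they were part of Base). The shortened sleep and the variable names w, fut and result are illustrative only.

```julia
using Distributed

# Assumption: start one local worker if Julia was launched without -p
if nprocs() == 1
    addprocs(1)
end

# Define the function on every process so that the worker can run it
@everywhere function prime(i)
    sleep(0.1)   # shortened stand-in for the 2-second delay
    i
end

w = first(workers())               # a worker id that is known to exist
fut = remotecall(prime, w, 1000)   # returns a Future immediately
result = fetch(fut)                # blocks until the worker has finished
```

Picking the id from workers() instead of hard-coding 2 avoids the failure when the process layout differs from what the script expects.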

1 Answer:

Answer 0 (score: 0)

OK, I have the following working solution:

macro swap(x,y)
  quote
    local tmp = $(esc(x))
    $(esc(x)) = $(esc(y))
    $(esc(y)) = tmp
  end
end

# some slow function
@everywhere function prime(i)
  sleep(2)
  println("prime $i")
  i
end

function slow_process(x)
  sleep(2)
  println("slow_process $x")
end

function each(rng)
  @assert length(rng) > 1
  rng = collect(rng)
  a = b = nothing
  function _iter()
    for i ∈ 1:length(rng)
      if a == nothing
        a = remotecall(prime, 2, rng[i])
        b = remotecall(prime, 2, rng[i+1])
      else
        if i < length(rng)
          a = remotecall(prime, 2, rng[i+1])
        end
        @swap(a,b)
      end
      d = fetch(a)
      produce(d)
    end
  end
  return Task(_iter)
end

@time for x ∈ each(1000:1002)
  slow_process(x)
end

% julia -p 2 test-task.jl
8.354102 seconds (148.00 k allocations: 6.204 MB)
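
Note that produce and Task iteration were deprecated in Julia 0.6 in favour of Channels, so the code above does not run on Julia 1.x. As a rough sketch of the same prefetching idea on Julia 1.3 or later (an assumption, not the method used in this answer), a buffered Channel bound to a producer task can compute the next item while the consumer is busy. The name each_buffered, the buffer keyword and the shortened sleeps are hypothetical.

```julia
# Shortened stand-ins for the slow functions in the question
function prime(i)
    sleep(0.2)
    i
end

function slow_process(x)
    sleep(0.2)
    x
end

# Hypothetical helper: a buffered Channel bound to a producer task.
# The task keeps calling prime until the buffer is full, so the next
# item is prepared while the consumer is still processing the current one.
function each_buffered(rng; buffer = 1)
    Channel{Int}(buffer) do ch
        for i in rng
            put!(ch, prime(i))
        end
    end
end

results = Int[]
elapsed = @elapsed for x in each_buffered(1000:1002)
    push!(results, slow_process(x))
end
```

Under these assumptions the loop should take roughly 0.2 + 3 × 0.2 seconds rather than the serial 1.2 seconds. The overlap here relies on sleep yielding to other tasks; CPU-bound work would still need worker processes (as in the remotecall version above) or threads to run in parallel.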