Julia @spawn and pmap() for an embarrassingly parallel problem that requires JuMP and Ipopt

Asked: 2019-07-06 03:55:48

Tags: parallel-processing julia distributed-computing

I'd really appreciate some help parallelizing the following pseudocode in Julia (apologies in advance for the long post):

P, Q   # both K by N matrix, K = num features and N = num samples
X, Y   # K*4 by N and K*2 by N matrices
tempX, tempY  # column vectors of size K*4 and K*2
ndata  # a dict from parsing a .m file to be used by a solver with JuMP and Ipopt

# serial version
for i = 1:N
    ndata[P] = P[:, i]  # technically requires a for loop from 1 to K since the dict has to be indexed element-wise
    ndata[Q] = Q[:, i]

    ndata_A = run_solver_A(ndata)  # with a third-party package and JuMP, Ipopt
    ndata_B = run_solver_B(ndata)

    kX, kY = 1, 1
    for j = 1:K
        tempX[kX:kX+3] = [ndata_A[j][a], ndata_A[j][b], P[j, i], Q[j, i]]
        tempY[kY:kY+1] = [ndata_B[j][a], ndata_B[j][b]]
        kX += 4
        kY += 2
    end
    X[:, i] = deepcopy(tempX)
    Y[:, i] = deepcopy(tempY)
end

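Clearly, the iterations of this for loop can be executed independently, as long as no column of P or Q is accessed twice and the same column i of P and Q is used in a given iteration. The only thing I need to be careful about is that column i of X and Y holds the matching tempX and tempY pair; I don't care as much whether the i = 1, ..., N order is maintained (hopefully that makes sense!).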

I read both the official documentation and some online tutorials, and wrote the following with @spawn and fetch. It works for the insertion part, with the placeholder numbers 1.0 and 180. standing in for ndata[j][a] and so on:

using Distributed
addprocs(2)
num_proc = nprocs()

@everywhere function insertPQ(P, Q)
    println(myid())
    data = zeros(4*length(P))
    k = 1
    for i = 1:length(P)
        data[k:k+3] = [1.0, 180., P[i], Q[i]]
        k += 4
    end
    return data
end

P = [0.99, 0.99, 0.99, 0.99]
Q = [-0.01, -0.01, -0.01, -0.01]
for i = 1:5  # should be 4 x 32
    global P = hcat(P, (P .- 0.01))
    global Q = hcat(Q, (Q .- 0.01))
end

datas = zeros(16, 32)  # serial result, filled in place
datap = zeros(16, 0)   # parallel result, grown with hcat

@time for i = 1:32
    s = fetch(@spawn insertPQ(P[:, i], Q[:, i]))
    global datap = hcat(datap, s)
end

@time for i = 1:32
    k = 1
    for j = 1:4
        datas[k:k+3, i] = [1.0, 180., P[j, i], Q[j, i]]
        k += 4
    end
end

println(datap == datas)

The above code works, but I did notice that the printed worker IDs are always 2 -> 3 -> 4 -> 5 -> 2 ..., and that it is much slower than the serial case (I'm testing this on my laptop with only 4 cores, but eventually I'll run it on a cluster). When I added run_solver_A/B to insertPQ(), I had to stop the run.
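One likely contributor to the slowdown is that calling fetch directly on each @spawn blocks until that one remote call returns, so only one worker is busy at a time. A minimal sketch of launching all tasks first and fetching afterwards, assuming Julia 1.3 or later for `@spawnat :any` (on older versions `@spawn` plays the same role), and with hypothetical random inputs in place of the real P and Q:

```julia
using Distributed
addprocs(2)  # assumption: two local workers, as in the snippet above

# Same placeholder computation as insertPQ, defined on every process.
@everywhere function insertPQ(P, Q)
    data = zeros(4 * length(P))
    k = 1
    for i in eachindex(P)
        data[k:k+3] = [1.0, 180.0, P[i], Q[i]]
        k += 4
    end
    return data
end

# Hypothetical 4 x 32 inputs, for illustration only.
P = rand(4, 32)
Q = rand(4, 32)

# Launch every task before waiting on any of them; these calls do not block.
futures = map(1:size(P, 2)) do i
    p, q = P[:, i], Q[:, i]       # copy the columns locally
    @spawnat :any insertPQ(p, q)  # only p and q are sent to the worker
end

# Only now wait for the results; column i comes from futures[i].
datap = reduce(hcat, fetch.(futures))
```

Because the futures are kept in index order, column i of datap still corresponds to column i of P and Q, regardless of which worker finished first. For a computation this small, communication overhead may still outweigh any gain over the serial loop.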

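As for pmap(), I couldn't figure out how to pass an entire vector to the function. I may be misunderstanding the documentation, but "Transform collection c by applying f to each element using available workers and tasks" sounds like I can only do this element-wise? That can't be right. I went to an introductory Julia session last week and asked the lecturer about this. He said I should use pmap(), and I've been trying to make it work ever since.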
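For illustration, the "elements" of the collection given to pmap can themselves be whole vectors. A minimal sketch, assuming the same placeholder insertPQ as above and hypothetical random inputs; the collections here are vectors of columns, so each call receives one full column of P and one of Q:

```julia
using Distributed
addprocs(2)  # assumption: two local workers

# Placeholder computation, defined on every process.
@everywhere function insertPQ(P, Q)
    data = zeros(4 * length(P))
    k = 1
    for i in eachindex(P)
        data[k:k+3] = [1.0, 180.0, P[i], Q[i]]
        k += 4
    end
    return data
end

# Hypothetical 4 x 32 inputs, for illustration only.
P = rand(4, 32)
Q = rand(4, 32)
N = size(P, 2)

# Build collections whose elements are entire columns.
Pcols = [P[:, i] for i in 1:N]
Qcols = [Q[:, i] for i in 1:N]

# pmap calls insertPQ(Pcols[i], Qcols[i]) on some worker for each i
# and returns the results in the same order as the inputs.
results = pmap(insertPQ, Pcols, Qcols)

# Assemble the 16 x 32 matrix; column i matches column i of P and Q.
X = reduce(hcat, results)
```

In the real problem the function passed to pmap would presumably also call run_solver_A/B, in which case the packages it needs would have to be loaded on every worker as well (for example with `@everywhere using JuMP, Ipopt`); that part is an assumption and is not shown here.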

So, how can I parallelize my original pseudocode? Any help or suggestions would be greatly appreciated!

0 answers:

No answers