Preventing modules from being overwritten in Julia parallelization

Date: 2016-08-11 20:23:09

Tags: parallel-processing julia

I wrote a Julia module with various functions that I call to analyze data. Some of those functions depend on packages, which are included at the top of the file "NeuroTools.jl".

module NeuroTools

using MAT, PyPlot, PyCall;

function getHists(channels::Array{Int8,2}...

Many of my functions are useful to run in parallel, so I wrote a driver script that uses remotecall/fetch to map functions onto different workers. To load the functions on each worker, I start Julia with the -L option so my module is loaded on every worker.

julia -p 16 -L NeuroTools.jl parallelize.jl

To bring the loaded functions into scope, the "parallelize.jl" script has the line

@everywhere using NeuroTools
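The driver pattern described above might look roughly like the sketch below. The per-channel jobs and the call to getHists are stand-ins for the actual analysis, and the remotecall(f, id, args...) argument order assumed here is the Julia 0.5 one:

```julia
# parallelize.jl -- minimal sketch of the remotecall/fetch driver pattern.
# Assumes Julia was started as: julia -p 16 -L NeuroTools.jl parallelize.jl

@everywhere using NeuroTools   # bring the module into scope on every worker

# Hypothetical workload: one channel block per worker. The real script
# would pass actual Array{Int8,2} data to functions such as getHists.
jobs = collect(zip(workers(), 1:nworkers()))

# Launch one remote call per worker, then fetch all the results.
refs = [remotecall(NeuroTools.getHists, w, ch) for (w, ch) in jobs]
results = [fetch(r) for r in refs]
```

Each remotecall returns immediately with a remote reference; the fetch loop then blocks until every worker has finished.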

My parallel functions work and execute correctly, but each worker emits a slew of warnings about modules being overwritten.

WARNING: replacing module MAT
WARNING: Method definition read(Union{HDF5.HDF5Dataset, HDF5.HDF5Datatype, HDF5.HDF5Group}, Type{Bool}) in module MAT_HDF5...
(continues for many lines)

Is there a way to load the modules differently, or to change their scope, to prevent all of these warnings? The documentation doesn't seem entirely clear on this point.

2 answers:

Answer 0 (score: 4)

Coincidentally, I was looking for the same thing this morning. Output on a process can be silenced with

(rd, wr) = redirect_stdout()

so to shut the warnings off entirely, you need to call

remotecall_fetch(worker_id, redirect_stdout)

on each worker. If you want to turn output back on afterwards, you can do

out = STDOUT
(a,b) = redirect_stdout()
#then to turn it back on, do:
redirect_stdout(out)
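Putting those pieces together, one way to silence every worker and later restore local output might look like the following sketch, which assumes the remotecall_fetch(id, f) argument order used above:

```julia
# Silence stdout on every worker so the "replacing module" warnings
# from module loading are swallowed.
for w in workers()
    remotecall_fetch(w, redirect_stdout)
end

# On the local process, keep a handle to the original stream first,
# so output can be restored later:
out = STDOUT
(rd, wr) = redirect_stdout()   # silence local output
# ... noisy module loading happens here ...
redirect_stdout(out)           # restore local output
```

Note this swallows all stdout on the workers, not just the warnings, so it is best done only around the noisy loading step.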

Answer 1 (score: 0)

This is fixed in the more recent releases, and @everywhere using ... is right if you really need the module in scope in all workers. This GitHub issue talks about the problem and has links to some of the other relevant discussions.

If you are still using an older version of Julia where this was the case, just write using NeuroTools in NeuroTools.jl after defining the module, instead of executing @everywhere using NeuroTools. The Parallel Computing section of the Julia documentation for version 0.5 says,

using DummyModule causes the module to be loaded on all processes; however, the module is brought into scope only on the one executing the statement.

Executing @everywhere using NeuroTools used to tell each process to load the module on all processes, and the result was a pile of "replacing module" warnings.
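On those older versions, the layout this answer suggests would make the module file bring itself into scope, so the driver script needs no @everywhere line at all. A sketch, assuming the file is loaded on every process via the -L flag as in the question:

```julia
# NeuroTools.jl -- loaded on every process by: julia -p 16 -L NeuroTools.jl
module NeuroTools

using MAT, PyPlot, PyCall

# ... function definitions such as getHists ...

end

# Plain `using` after the module definition: since -L already executed
# this file on every process, each process brings the module into its
# own scope here, and no process re-loads (and re-defines) the module.
using NeuroTools
```

With this arrangement, parallelize.jl simply drops its @everywhere using NeuroTools line.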