Using SPMD to run a sequence of jobs while updating a common variable

Time: 2014-02-10 02:17:47

Tags: matlab parallel-processing

I am currently trying to run very time-consuming experiments in parallel using MATLAB 2013b.

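One strategy to speed things up is to use the results of one experiment to "warm start" the next. In my case this is slightly complicated, because each experiment has one of n_types types, and I can only use an experiment of type k to speed up another experiment of type k.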

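Unfortunately, I cannot implement this strategy with parfor, because it would require every job to update a common variable (the one that stores the warm-start information). That said, I have heard that this may be possible with the spmd framework.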

I am wondering whether someone could help me "translate" the following generic (non-working) parfor code block into something that works with spmd.

n_cores = %provided by user (# of workers that are available)
inputs  = %provided by user (n_jobs x 1 cell array of structs)
types   = %provided by user (n_jobs x 1 array of integer type labels, one per job)
n_jobs  = length(inputs)
n_types = length(unique(types))

outputs     = cell(n_jobs,1) %cell array to store job output
warm_starts = cell(0,n_types) %empty 0 x n_type cell array to store warm start data

matlabpool('open',n_cores)

parfor i = 1:n_jobs

   %run myfun in parallel
   outputs{i} = myfun(inputs{i},warm_starts(types(i)));

   %update warm start data for experiments of this type with data from current experiment
   warm_starts{end+1,types(i)} = get_warm_start(outputs{i});

end

1 answer:

Answer 0 (score: 1)

It is not quite clear to me how many warm_starts you might wish to store for each type. I am going to assume you only wish to store 1. Here is how you might do that:

jobs = rand(1,97); % note prime number of jobs
types = randi([1, 5], size(jobs));
n_jobs = numel(jobs);
n_types = numel(unique(types));
warm_starts = cell(1, n_types);

spmd
    jobs_per_lab = ceil(n_jobs / numlabs);
    outputs = cell(jobs_per_lab, 1);
    for idx = 1:jobs_per_lab
        job_idx = idx + ((labindex-1)*jobs_per_lab);
        if job_idx > n_jobs
            % Off the end of 'jobs', no work to do
            this_warm_start = NaN;
            this_type = NaN;
        else
            this_type = types(job_idx);
            if ~isempty(warm_starts{this_type})
                this_warm_start = warm_starts{this_type};
            else
                this_warm_start = 0;
            end
            outputs{idx} = this_warm_start + types(job_idx) * jobs(job_idx); % some function goes here
            this_warm_start = rand();
        end

        % All-to-all communication to exchange 'this_warm_start' values.
        % After this, each worker has a 2 x numlabs cell array of warm starts and types
        all_warm_starts_this_round = gcat({this_type; this_warm_start}, 2);
        for w = 1:numlabs
            warm_start_type = all_warm_starts_this_round{1, w};
            warm_start_value = all_warm_starts_this_round{2, w};
            if ~isnan(warm_start_type)
                warm_starts{warm_start_type} = warm_start_value;
            end
        end
    end
    % Finally, collect all results on lab 1
    outputs = gcat(outputs, 1, 1);
end
% Dereference the Composite
outputs = outputs{1};

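The main thing I have done there is to split up the work manually so that each worker operates on a chunk of 'jobs', and then use GCAT to broadcast the warm-start information after each round.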
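To make the manual split easier to check, here is a minimal serial sketch of the same index arithmetic, runnable without a pool. It only reproduces the mapping job_idx = idx + (labindex-1)*jobs_per_lab; the names n_workers, owner, round_of and n_idle are hypothetical and introduced purely for illustration, and the pool size of 4 is an assumption.

```matlab
% Serial sketch of the chunking used inside the spmd block above.
% ASSUMPTIONS: n_workers stands in for numlabs and w for labindex;
% owner, round_of and n_idle are illustrative names, not toolbox API.
n_jobs       = 97;
n_workers    = 4;                          % hypothetical pool size
jobs_per_lab = ceil(n_jobs / n_workers);   % rounds each worker steps through

owner    = zeros(n_jobs, 1);   % owner(j)    = worker that runs job j
round_of = zeros(n_jobs, 1);   % round_of(j) = round in which job j runs
n_idle   = 0;                  % padded slots that hold no job

for w = 1:n_workers
    for idx = 1:jobs_per_lab
        job_idx = idx + (w - 1) * jobs_per_lab;
        if job_idx > n_jobs
            % Off the end of the job list: this worker does no work in this
            % round, but in the spmd version it must still call gcat.
            n_idle = n_idle + 1;
        else
            owner(job_idx)    = w;
            round_of(job_idx) = idx;
        end
    end
end

% Jobs that run side by side in the first round
first_round_jobs = find(round_of == 1)';
```

Under these assumptions each worker owns one contiguous block of jobs, and the jobs that share a round index run concurrently. A warm start produced in a given round therefore only reaches the other workers from the next round onward, and when two workers finish jobs of the same type in the same round, the loop over w keeps the value from the higher-numbered worker.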