How can I execute a function in parallel?

Asked: 2015-11-30 15:44:42

Tags: python parallel-processing

I am trying to call the function below [1] in parallel. To do that, I created the function in [2], which I invoke as shown in [3] and [4]. The problem is that when I run this code, execution hangs and I never see the result; if I run run_simple_job serially, everything works fine. Why can't I execute this function in parallel? Any suggestions?

[1] The function I am trying to call

@make_verbose
def run_simple_job(job_params):
    """
    Execute a job remotely and get the digests.
    The output comes as a JSON file containing info about the input and
    output paths and the generated digest.

    :param job_params: (namedtuple) contains several attributes important
        for the job during execution:
        client_id (string) id of the client
        command (string) command to execute the job
        cluster (string) where the job will run
        task_type (TypeTask) contains information about the job that will run
        should_tamper (Boolean) tells whether this job should tamper with the digests
    :return: output (string) the output of the job execution
    """
    client_id = job_params.client_id
    _command = job_params.command
    cluster = job_params.cluster
    task_type = job_params.task_type

    output = ...  # execute the job (elided in the original)

    return output

[2] The function that makes the calls in parallel

import logging
from itertools import izip  # Python 2; on Python 3, use the built-in zip
from multiprocessing import Pipe, Process

def spawn(f):
    # 1 - how do the pipe and x arguments end up here?
    def fun(pipe, x):
        pipe.send(f(x))
        pipe.close()
    return fun

def parmap2(f, X):
    pipe = [Pipe() for x in X]
    # 2 - what is happening with the tuples (c, x) and (p, c)?
    proc = [Process(target=spawn(f), args=(c, x))
            for x, (p, c) in izip(X, pipe)]

    for p in proc:
        logging.debug("Spawn")
        p.start()
    for p in proc:
        logging.debug("Joining")
        p.join()
    return [p.recv() for (p, c) in pipe]
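
For reference, here is a minimal self-contained sketch of the same pipe-based pattern, using Python 3 names (zip instead of izip) and a stand-in double function instead of the real job. The (p, c) tuples are the parent and child connection ends returned by Pipe(), and args=(c, x) is how the pipe and x arguments reach fun. Note that this sketch receives from each pipe before joining the workers: joining first, as parmap2 does, can deadlock if a child blocks in pipe.send() while the parent is waiting in join(), which is a common cause of hangs with this pattern.

from multiprocessing import Pipe, Process

def spawn(f):
    def fun(conn, x):
        conn.send(f(x))  # ship the result back to the parent
        conn.close()
    return fun

def parmap(f, X):
    pipes = [Pipe() for _ in X]  # one (parent_conn, child_conn) pair per input
    procs = [Process(target=spawn(f), args=(child, x))
             for x, (parent, child) in zip(X, pipes)]
    for p in procs:
        p.start()
    # Receive BEFORE joining: a child blocked in send() never exits,
    # so a join() issued first could wait forever.
    results = [parent.recv() for (parent, child) in pipes]
    for p in procs:
        p.join()
    return results

def double(x):  # stand-in for the real job
    return 2 * x

if __name__ == '__main__':
    print(parmap(double, [1, 2, 3]))  # prints [2, 4, 6]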

[3] The wrapper class

class RunSimpleJobWrapper:
    """ Wrapper used when running a job """

    def __init__(self, params):
        self.params = params

[4] How I call the function so it runs in parallel

run_wrapper_list = []
for cluster in clusters:
    task_type = task_type_by_cluster[cluster]
    run_wrapper_list.append(RunSimpleJobWrapper(get_job_parameter(
        client_id, cluster, job.command, majority(FAULTS), task_type)))

jobs_output = parmap2(run_simple_job_wrapper, run_wrapper_list)

1 Answer:

Answer 0 (score: 1)

You can simply use multiprocessing:

from multiprocessing import Pool

pool = Pool()  # with no argument, Pool uses all available CPUs

param_list = [...]  # generate a list of your parameters

results = pool.map(run_simple_job, param_list)
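
A complete, runnable version of this approach might look as follows. This is a minimal sketch: run_simple_job is replaced by a trivial stand-in and param_list by dummy values.

from multiprocessing import Pool, cpu_count

def run_simple_job(job_params):
    # Stand-in for the real job: just double the parameter.
    return job_params * 2

if __name__ == '__main__':
    param_list = [1, 2, 3, 4]    # stand-in for the real parameter list
    pool = Pool(cpu_count())     # one worker process per available CPU
    results = pool.map(run_simple_job, param_list)
    pool.close()
    pool.join()
    print(results)               # prints [2, 4, 6, 8]

Because the workers receive their work by pickling, everything passed through pool.map (the target function and each element of param_list, here the job_params namedtuple) must be picklable, so run_simple_job has to be defined at module top level.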