Running simultaneous & independent ABAQUS models with concurrent.futures.ProcessPoolExecutor

Date: 2016-07-29 20:51:57

Tags: python multiprocessing python-multiprocessing concurrent.futures abaqus

I would like to run a total of nAnalysis = 25 Abaqus models, each using X cores, and I can run nParallelLoops = 5 of these models simultaneously. Whenever one of the current 5 analyses finishes, another one should start, until all nAnalysis are complete.

I implemented the code below based on the solutions posted in 1 and 2. However, I am missing something, because all nAnalysis try to start "at once", the code deadlocks, and no analysis ever completes, since many of them may want to use the same cores that an already running analysis is using.

  1. Using Python's Multiprocessing module to execute simultaneous and separate SEAWAT/MODFLOW model runs
  2. How to parallelize this nested loop in Python that calls Abaqus
    def runABQfile(*args):    
        import subprocess
        import os
    
        inpFile,path,jobVars = args
    
        prcStr1 = (path+'/runJob.sh')
    
        process = subprocess.check_call(prcStr1, stdin=None, stdout=None, stderr=None, shell=True, cwd=path)
    
    def safeABQrun(*args):
        import os
    
        try:
            runABQfile(*args)
        except Exception as e:
            print("Thread Error: %s runABQfile(*%r)" % (e, args))
    
    def errFunction(ppos, *args):
        import os
        from concurrent.futures import ProcessPoolExecutor
        from concurrent.futures import as_completed
        from concurrent.futures import wait
    
        with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
            future_to_file = dict((executor.submit(safeABQrun, inpFiles[k], aPath[k], jobVars), k) for k in range(0,nAnalysis))  # 5Nodes
            wait(future_to_file,timeout=None,return_when='ALL_COMPLETED')
    

    The only way I have managed to run this so far is by modifying errFunction to use 5 analyses at a time, as below. However, this approach sometimes causes one analysis in each group to take much longer than the other 4 (per ProcessPoolExecutor call), so the next group of 5 does not start even though resources (cores) are available. Ultimately this means more time to complete all 25 models.

    def errFunction(ppos, *args):
        import os
        from concurrent.futures import ProcessPoolExecutor
        from concurrent.futures import as_completed
        from concurrent.futures import wait    
    
        # Group 1
        with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
            future_to_file = dict((executor.submit(safeABQrun, inpFiles[k], aPath[k], jobVars), k) for k in range(0,5))  # 5Nodes        
            wait(future_to_file,timeout=None,return_when='ALL_COMPLETED')
    
        # Group 2
        with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
            future_to_file = dict((executor.submit(safeABQrun, inpFiles[k], aPath[k], jobVars), k) for k in range(5,10))  # 5Nodes        
            wait(future_to_file,timeout=None,return_when='ALL_COMPLETED')
    
        # Group 3
        with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
            future_to_file = dict((executor.submit(safeABQrun, inpFiles[k], aPath[k], jobVars), k) for k in range(10,15))  # 5Nodes        
            wait(future_to_file,timeout=None,return_when='ALL_COMPLETED')
    
        # Group 4
        with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
            future_to_file = dict((executor.submit(safeABQrun, inpFiles[k], aPath[k], jobVars), k) for k in range(15,20))  # 5Nodes        
            wait(future_to_file,timeout=None,return_when='ALL_COMPLETED')
    
        # Group 5
        with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
            future_to_file = dict((executor.submit(safeABQrun, inpFiles[k], aPath[k], jobVars), k) for k in range(20,25))  # 5Nodes        
            wait(future_to_file,timeout=None,return_when='ALL_COMPLETED')
    

    I tried using the as_completed function, but it does not seem to work either.

    Could you please help me figure out the proper parallelization, so that I can run nAnalysis analyses, with nParallelLoops of them always running concurrently? Thanks for your help. I am using Python 2.7.

    Best, David P.

    Update, July 30, 2016:

    I introduced a loop in safeABQrun to manage 5 different "queues". The loop is necessary to avoid an analysis trying to run on a node while another one is still running there. The analyses are pre-configured to run on one of the requested nodes before any actual analysis starts.

    def safeABQrun(*list_args):
        import os
    
        inpFiles,paths,jobVars = list_args
    
        nA = len(inpFiles)
        for k in range(0,nA): 
            args = (inpFiles[k],paths[k],jobVars[k])
            try:
                runABQfile(*args) # Actual Run Function
            except Exception as e:
                print("Thread Error: %s runABQfile(*%r)" % (e, args))
    
    def errFunction(list_args):
        from concurrent.futures import ProcessPoolExecutor, as_completed

        with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
            futures = dict((executor.submit(safeABQrun, inpF, aPth, jVrs), k) for inpF, aPth, jVrs, k in list_args)  # 5 nodes
    
            for f in as_completed(futures):
                print("|=== Finish Process Train %d ===|" % futures[f])
                if f.exception() is not None:
                   print('%r generated an exception: %s' % (futures[f], f.exception()))
    

2 Answers:

Answer 0 (score: 0)

Looks OK to me, but I can't run your code as-is. How about trying something much simpler, then adding pieces until "a problem" shows up? For example, does the following show the kind of behavior you want? It does on my machine, but I'm running Python 3.5.2. You say you're running 2.7, but concurrent.futures doesn't exist in Python 2 - so if you're using 2.7, you must be running someone's backport of the library, and perhaps the problem lies in that. Trying the following should help answer whether that's the case:

from concurrent.futures import ProcessPoolExecutor, wait, as_completed

def worker(i):
    from time import sleep
    from random import randrange
    s = randrange(1, 10)
    print("%d started and sleeping for %d" % (i, s))
    sleep(s)

if __name__ == "__main__":
    nAnalysis = 25
    nParallelLoops = 5
    with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
        futures = dict((executor.submit(worker, k), k) for k in range(nAnalysis))
        for f in as_completed(futures):
            print("got %d" % futures[f])

Answer 1 (score: 0)

I introduced a loop in safeABQrun to manage 5 different "queues". The loop is necessary to avoid an analysis trying to run on a node while another one is still running there. The analyses are pre-configured to run on one of the requested nodes before any actual analysis starts.

def safeABQrun(*list_args):
    import os

    inpFiles,paths,jobVars = list_args

    nA = len(inpFiles)
    for k in range(0,nA): 
        args = (inpFiles[k],paths[k],jobVars[k])
        try:
            runABQfile(*args) # Actual Run Function
        except Exception as e:
            print("Thread Error: %s runABQfile(*%r)" % (e, args))

def errFunction(list_args):
    from concurrent.futures import ProcessPoolExecutor, as_completed

    with ProcessPoolExecutor(max_workers=nParallelLoops) as executor:
        futures = dict((executor.submit(safeABQrun, inpF, aPth, jVrs), k) for inpF, aPth, jVrs, k in list_args)  # 5 nodes

        for f in as_completed(futures):
            print("|=== Finish Process Train %d ===|" % futures[f])
            if f.exception() is not None:
               print('%r generated an exception: %s' % (futures[f], f.exception()))