Wrapping a multiprocessing pool in a loop (shared memory between processes)

Date: 2014-10-20 16:14:46

Tags: python multiprocessing python-multiprocessing

I'm using the Python package "deap" to solve some multiobjective optimization problems with genetic algorithms. The functions can be very expensive to evaluate, and because of the evolutionary nature of the GA, that cost compounds quickly. Now, the package does have some support for parallelizing the evolutionary computations with multiprocessing.

However, I'd like to go a step further and run the optimization multiple times, with different values for some of the optimization parameters. For example, I might want to solve the optimization problem with different values of the weights.

This seems like a natural case for a loop, but the problem is that these parameters must be defined in the global scope of the program (i.e., above the "main" function) so that all the subprocesses know about them. Here is some pseudocode:

import itertools
import multiprocessing

import numpy as np
from deap import base, creator, tools

# define deap parameters - have to be in the global scope
toolbox = base.Toolbox()
history = tools.History()
weights = [1, 1, -1]  # This is primarily what I want to vary
creator.create("Fitness", base.Fitness, weights=weights)
creator.create("Individual", np.ndarray, fitness=creator.Fitness)

def main():
    # run GA to solve multiobjective optimization problem
    return my_optimized_values

if __name__=='__main__':
    ## What I'd like to do but can't ##
    ## all_weights =  list(itertools.product([1, -1],repeat=3))
    ## for combo in all_weights:
    ##     weights = combo
    ##
    pool = multiprocessing.Pool(processes=6)
    # This can be down here, and it distributes the GA computations to a pool of workers
    toolbox.register("map", pool.map)
    my_values = main()

I've looked into various possibilities, such as multiprocessing.Value, the pathos fork of multiprocessing, and so on, but in the end there's always some problem with the child processes reading the Individual class.

I've posed this question on the deap users' group, but it isn't nearly as big a community as SO. Besides, it seems to me that this is more of a conceptual Python question than a specific deap question. My current solution is just to run the code multiple times, changing some of the parameter definitions each time. At least this way the GA computations are still parallelized, but it requires more manual intervention than I'd like.
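(As a hypothetical sketch of that manual workaround, not from the original post: the weights can stay in the global scope but be read from the command line, so each run of the script uses a different combination without editing the file, e.g. python my_ga.py 1 1 -1.)

# Hypothetical: the weights are still defined in the global scope, so the
# subprocesses see them, but they come from sys.argv instead of a literal.
import sys

if len(sys.argv) > 1:
    weights = [int(w) for w in sys.argv[1:4]]
else:
    weights = [1, 1, -1]  # default combination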

Any suggestions or advice would be greatly appreciated!

2 answers:

Answer 0 (score: 0)

Use the initializer/initargs keyword arguments of Pool to pass different values to the global variables that need to change on each run. The initializer function will be called with initargs as its arguments inside each worker process of the Pool, as soon as it starts up. You can set your global variables to the desired values there, and they'll be set properly inside each child for the lifetime of the pool.

You'll need to create a different Pool for each run, but that shouldn't be a problem:

toolbox = base.Toolbox()
history = tools.History()
weights = None  # We'll set this in the children later.

def init(_weights):
    # This will run in each child process.
    global weights
    weights = _weights
    creator.create("Fitness", base.Fitness, weights=weights)
    creator.create("Individual", np.ndarray, fitness=creator.Fitness)

if __name__ == '__main__':
    all_weights = list(itertools.product([1, -1], repeat=3))
    for combo in all_weights:
        weights = combo
        pool = multiprocessing.Pool(processes=6, initializer=init,
                                    initargs=(weights,))
        toolbox.register("map", pool.map)
        my_values = main()
        pool.close()
        pool.join()
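For illustration, here is the same initializer/initargs mechanism stripped down to a minimal, self-contained sketch (the names init, scale, and times_scale are invented for this example): each new Pool installs a different value of the global in its workers.

import multiprocessing

scale = None  # set inside each worker by init()

def init(_scale):
    # Runs once in every worker process as it starts up.
    global scale
    scale = _scale

def times_scale(x):
    # Reads the global that init() installed in this worker.
    return x * scale

if __name__ == '__main__':
    for s in (2, 10):
        pool = multiprocessing.Pool(processes=2, initializer=init,
                                    initargs=(s,))
        print(pool.map(times_scale, [1, 2, 3]))  # [2, 4, 6], then [10, 20, 30]
        pool.close()
        pool.join()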

Answer 1 (score: 0)

I was also uneasy about DEAP's use of the global scope, and I think I have an alternative solution for you.

Each loop iteration can import its own version of each module, thereby avoiding the reliance on the global scope.

this_random = importlib.import_module("random")
this_creator = importlib.import_module("deap.creator")
this_algorithms = importlib.import_module("deap.algorithms")
this_base = importlib.import_module("deap.base")
this_tools = importlib.import_module("deap.tools")

As far as I can tell, this trick seems to work fine with multiprocessing.

As an example, here's a version of DEAP's onemax_mp.py that avoids putting any DEAP objects in the global scope. I've included a loop in __main__ that changes the weights on each iteration. (It maximizes the number of ones the first time through, and minimizes it the second time.) Everything works fine with multiprocessing.

#!/usr/bin/env python2.7
#    This file is part of DEAP.
#
#    DEAP is free software: you can redistribute it and/or modify
#    it under the terms of the GNU Lesser General Public License as
#    published by the Free Software Foundation, either version 3 of
#    the License, or (at your option) any later version.
#
#    DEAP is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
#    GNU Lesser General Public License for more details.
#
#    You should have received a copy of the GNU Lesser General Public
#    License along with DEAP. If not, see <http://www.gnu.org/licenses/>.

import array
import multiprocessing
import sys

if sys.version_info < (2, 7):
    print("mpga_onemax example requires Python >= 2.7.")
    exit(1)

import numpy
import importlib


def evalOneMax(individual):
    return sum(individual),


def do_onemax_mp(weights, random_seed=None):
    """ Run the onemax problem with the given weights and random seed. """

    # create local copies of each module
    this_random = importlib.import_module("random")
    this_creator = importlib.import_module("deap.creator")
    this_algorithms = importlib.import_module("deap.algorithms")
    this_base = importlib.import_module("deap.base")
    this_tools = importlib.import_module("deap.tools")

    # hoisted from global scope
    this_creator.create("FitnessMax", this_base.Fitness, weights=weights)
    this_creator.create("Individual", array.array, typecode='b',
                        fitness=this_creator.FitnessMax)
    this_toolbox = this_base.Toolbox()
    this_toolbox.register("attr_bool", this_random.randint, 0, 1)
    this_toolbox.register("individual", this_tools.initRepeat,
                          this_creator.Individual, this_toolbox.attr_bool, 100)
    this_toolbox.register("population", this_tools.initRepeat, list,
                          this_toolbox.individual)
    this_toolbox.register("evaluate", evalOneMax)
    this_toolbox.register("mate", this_tools.cxTwoPoint)
    this_toolbox.register("mutate", this_tools.mutFlipBit, indpb=0.05)
    this_toolbox.register("select", this_tools.selTournament, tournsize=3)

    # hoisted from __main__
    this_random.seed(random_seed)
    pool = multiprocessing.Pool(processes=4)
    this_toolbox.register("map", pool.map)
    pop = this_toolbox.population(n=300)
    hof = this_tools.HallOfFame(1)
    this_stats = this_tools.Statistics(lambda ind: ind.fitness.values)
    this_stats.register("avg", numpy.mean)
    this_stats.register("std", numpy.std)
    this_stats.register("min", numpy.min)
    this_stats.register("max", numpy.max)

    this_algorithms.eaSimple(pop, this_toolbox, cxpb=0.5, mutpb=0.2, ngen=40,
                             stats=this_stats, halloffame=hof)

    pool.close()
    pool.join()

if __name__ == "__main__":
    for tgt_weights in ((1.0,), (-1.0,)):
        do_onemax_mp(tgt_weights)
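One caveat worth noting (an observation added here, not part of the original answer): importlib.import_module does not create a fresh copy of a module; it returns the object cached in sys.modules, so every iteration of the loop gets the same module back. The approach above still works because creator.create simply redefines the named class on each call (recent DEAP versions emit a warning when an existing class is overwritten). A quick sketch of the caching behavior:

import importlib

a = importlib.import_module("random")
b = importlib.import_module("random")
assert a is b  # import_module returns the same cached module from sys.modules

# A genuinely fresh module state would require a reload, which re-executes
# the module's code in place: importlib.reload(a) in Python 3, or the
# builtin reload(a) in Python 2.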