High-performance computing in SciPy with a numerical function applied independently to a large set of inputs

Date: 2015-12-29 00:16:50

Tags: python multithreading scipy multiprocessing

I have a numerical function in Python (based on scipy.optimize.minimize):

def func(x):
    # calculation, returning 0 if done
    ...

and an algorithm like the following:

for x in X:
    run func(x)
    terminate the loop if one of func(x) returns 0

Above, X is a large set of doubles, and each func(x) is independent of the others.
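For concreteness, func might look something like this minimal sketch; the quadratic-plus-cosine objective and the acceptance test are illustrative assumptions, not part of the question:

import numpy as np
from scipy.optimize import minimize

def func(x):
    # Hypothetical objective parameterised by the input x (an assumption
    # for illustration; the real calculation is problem-specific)
    objective = lambda v: (v[0] - x) ** 2 + np.cos(v[0])
    res = minimize(objective, x0=[x])
    # Report 0 when a problem-specific acceptance test is met
    return 0 if res.success and res.fun < -0.99 else 1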

Question: which of Python's multithreading or multiprocessing facilities can I use to maximize the performance of this computation?

For reference, I am using a multi-core machine.

1 Answer:

Answer 0 (score: 1)

If you have multiple cores, you will need to use multiprocessing to see the gains (the work is CPU-bound, so threads would be serialized by the GIL). To get a result out of a large set of candidates, you can break them up into batches. This sample code should help you see what to do.

"""
Draws on https://pymotw.com/2/multiprocessing/communication.html

"""
import multiprocessing


class Consumer(multiprocessing.Process):

    def __init__(self, task_queue, result_queue):
        multiprocessing.Process.__init__(self)
        self.task_queue = task_queue
        self.result_queue = result_queue

    def run(self):
        while True:
            next_task = self.task_queue.get()
            if next_task is None:
                # Poison pill means shutdown
                self.task_queue.task_done()
                break
            answer = next_task()
            self.task_queue.task_done()
            self.result_queue.put(answer)
        return


class Optimiser(object):

    def __init__(self, x):
        self.x = x

    def __call__(self):
        # The scipy optimisation call goes here; this equality test
        # stands in for "the optimisation succeeded for this input"
        if self.x == 49195:
            return self.x


def chunks(iterator, n):
    """Yield successive n-sized chunks from iterator.
    http://stackoverflow.com/a/312464/1706564

    """
    for i in range(0, len(iterator), n):
        yield iterator[i:i+n]


if __name__ == '__main__':
    X = range(1, 50000)
    # Establish communication queues
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()

    # Start consumers
    num_consumers = multiprocessing.cpu_count()
    consumers = [Consumer(tasks, results)
                 for i in range(num_consumers)]

    for w in consumers:
        w.start()

    found = None
    chunksize = 100  # size this so one chunk runs in roughly 1 to 10 seconds
    for chunk in chunks(X, chunksize):
        num_jobs = len(chunk)  # the final chunk may be smaller than chunksize
        # Enqueue jobs
        for x in chunk:
            tasks.put(Optimiser(x))

        # Wait for all of the tasks to finish
        tasks.join()

        # Check this batch's results; stop at the first success
        while num_jobs:
            result = results.get()
            num_jobs -= 1
            if result:
                found = result
                break
        if found is not None:
            break

    # Shut the consumers down whether or not a result was found
    for i in range(num_consumers):
        tasks.put(None)
    for w in consumers:
        w.join()

    print('Result:', found)