I have a numerical function in Python (based on scipy.optimize.minimize):
def func(x):
    # calculation, returning 0 if done
    ...
and an algorithm as follows:
for x in X:
    if func(x) == 0:
        break  # terminate the loop as soon as one func(x) returns 0
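Here, X is a large set of doubles, and each func(x) is independent of the others.
Question: which of Python's multithreading/multiprocessing facilities can I use to maximize the performance of this computation?
For information, I am using a multi-core computer.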
Answer 0 (score: 1)
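If you have multiple cores, you will need to use multiprocessing to see a benefit, because CPython's global interpreter lock prevents threads from running Python bytecode in parallel. To get a result out of a large number of candidates, you can split them into batches and check for a result after each batch. This sample code should help show what to do.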
"""
Draws on https://pymotw.com/2/multiprocessing/communication.html
"""
import multiprocessing
class Consumer(multiprocessing.Process):
    def __init__(self, task_queue, result_queue):
        multiprocessing.Process.__init__(self)
        self.task_queue = task_queue
        self.result_queue = result_queue
    def run(self):
        while True:
            next_task = self.task_queue.get()
            if next_task is None:
                # Poison pill means shutdown
                self.task_queue.task_done()
                break
            answer = next_task()
            self.task_queue.task_done()
            self.result_queue.put(answer)
        return
class Optimiser(object):
    def __init__(self, x):
        self.x = x
    def __call__(self):
        # scipy optimisation function goes here
        if self.x == 49195:
            return self.x
def chunks(seq, n):
    """Yield successive n-sized chunks from seq (which must support len() and slicing).
    http://stackoverflow.com/a/312464/1706564
    """
    for i in range(0, len(seq), n):
        yield seq[i:i+n]
if __name__ == '__main__':
    X = range(1, 50000)
    # Establish communication queues
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()
    # Start consumers, one per CPU core
    num_consumers = multiprocessing.cpu_count()
    consumers = [Consumer(tasks, results) for _ in range(num_consumers)]
    for w in consumers:
        w.start()
    chunksize = 100  # size each chunk so that it runs in around 1 to 10 seconds
    found = None
    for chunk in chunks(X, chunksize):
        num_jobs = len(chunk)  # the last chunk may be shorter than chunksize
        # Enqueue jobs
        for x in chunk:
            tasks.put(Optimiser(x))
        # Wait for all of the tasks in this chunk to finish
        tasks.join()
        # Collect exactly one result per job in this chunk
        while num_jobs:
            result = results.get()
            num_jobs -= 1
            if result is not None:
                found = result
        if found is not None:
            break  # stop submitting further chunks
    # Add a poison pill to shut down each consumer, whether or not a result was found
    for _ in range(num_consumers):
        tasks.put(None)
    for w in consumers:
        w.join()
    print('Result: %s' % found)
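
For comparison, the same batch-and-stop-early idea can probably be written more compactly with multiprocessing.Pool. This is only a sketch under assumptions, not the approach used above: it targets Python 3, func here is a hypothetical stand-in for the scipy-based calculation from the question, and evaluate and find_first_zero are names made up for illustration.

```python
import multiprocessing


def func(x):
    # Hypothetical stand-in for the scipy-based calculation:
    # returns 0 when x is the value being searched for, 1 otherwise.
    return 0 if x == 49195 else 1


def evaluate(x):
    # Return the input alongside the result so the caller knows which x succeeded.
    return x, func(x)


def find_first_zero(X, chunksize=100):
    """Return the first x seen for which func(x) == 0, or None if there is none."""
    with multiprocessing.Pool() as pool:
        # imap_unordered hands work to the workers in batches of chunksize
        # and yields results as they complete, in no particular order.
        for x, value in pool.imap_unordered(evaluate, X, chunksize=chunksize):
            if value == 0:
                # Leaving the with block terminates the pool,
                # which discards any work that is still pending.
                return x
    return None


if __name__ == '__main__':
    print(find_first_zero(range(1, 50000)))
```

The chunksize argument plays the same role as the batch size in the queue-based version: larger values reduce inter-process overhead, but more work is already in flight when a result is found.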