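I'm playing around with Python's multiprocessing (mp) library and I have a few questions about performance.
My code simply multiplies every member of a list by 2. My benchmarking approach is very simple. First, I run a serial version, where I just call my function on a list of 1 000 000 numbers. For the multiprocessing part, I divide the original list into a number of sublists and then run the computation, spawning one or more processes.
No matter how many sublists I divide the original list into, the mp approach is always slower. For example: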
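Splitting my list into 2 and spawning 2 processes:
Serial in 0.256868839264 seconds
MPed in 0.442973852158 seconds
Splitting my list into 4 and spawning 4 processes:
Serial in 0.223025798798 seconds
MPed in 0.347413063049 seconds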
import sys
import time
import multiprocessing as mp

# Function that multiplies every member of a list by 2
# (the result is discarded; only the loop cost is measured)
def listsum(mylist):
    for elem in mylist:
        elem*2
    return 1

# Function for running listsum() in parallel, the number of
# processes spawned is determined by the user
def mped(procs, vectors):
    pool = mp.Pool(processes=procs)
    res = [pool.apply_async(listsum, args=([vecs])) for vecs in vectors]
    res = [p.get() for p in res]
    return res

# Create a list of numbers, divide it into n (determined by user)
# sublists and run a serial version with the original list and a
# parallel version with mp
if __name__ == "__main__":
    # list with 1 000 000 numbers
    myvec = []
    for i in xrange(1000000):
        myvec.append(i)

    # Divide list in sublists
    numlists = int(sys.argv[1])
    segments = 1000000 / numlists
    lists = []
    j = 0
    for i in xrange(1, numlists + 1):
        lists.append(myvec[j:segments*i])
        j = segments*i

    # profile
    serialstart = time.time()
    a = listsum(myvec)
    serialend = time.time()
    print "Serial in", serialend - serialstart, "seconds"

    mpstart = time.time()
    b = mped(int(sys.argv[2]), lists)
    mpend = time.time()
    print "MPed in", mpend - mpstart, "seconds"
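
To narrow down where the time goes, here is a minimal sketch of the same benchmark that times pool construction separately from the parallel work. It rests on a few assumptions: it is written for Python 3 (the script above is Python 2), it uses `pool.map` in place of `apply_async`, and the names `double_all`, `split_evenly` and `timed_parallel` are hypothetical helpers that do not appear in the original script. The parallel timing still includes sending each sublist to a worker process, so it is not a measure of pure computation.

```python
# Python 3 sketch; helper names are illustrative, not from the original script.
import multiprocessing as mp
import time


def double_all(numbers):
    """Multiply every element by 2 and return how many were processed."""
    count = 0
    for value in numbers:
        value * 2  # result discarded, as in the original listsum()
        count += 1
    return count


def split_evenly(items, parts):
    """Split items into `parts` contiguous sublists; sizes differ by at most 1."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def timed_parallel(chunks, procs):
    """Return (pool construction seconds, map seconds, per-chunk results)."""
    t0 = time.perf_counter()
    with mp.Pool(processes=procs) as pool:
        t1 = time.perf_counter()
        # Each sublist is pickled and sent to a worker before it is processed.
        results = pool.map(double_all, chunks)
        t2 = time.perf_counter()
    return t1 - t0, t2 - t1, results


if __name__ == "__main__":
    numbers = list(range(1000000))
    chunks = split_evenly(numbers, 4)

    start = time.perf_counter()
    double_all(numbers)
    print("Serial in", time.perf_counter() - start, "seconds")

    startup, work, results = timed_parallel(chunks, procs=4)
    print("Pool construction in", startup, "seconds")
    print("Parallel map in", work, "seconds")
```

Unlike the slicing loop in the original script, `split_evenly` keeps every element even when the list length is not divisible by the number of sublists, so the serial and parallel runs process the same amount of data.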