I am trying to understand the difference between threads and processes in Python by running the code below.
from numba import jit
import random
import time
import concurrent.futures

@jit(nopython=True, nogil=True)
def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x**2 + y**2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

if __name__ == '__main__':
    nparl = 6 # Number of parallel processes/threads
    print("Number of parallel processes/threads: ", nparl)
    n_in = int(4e6) # input for monte_carlo_pi
    _ = monte_carlo_pi(n_in) #use once for jit
    print('********************************')
    print('1 - SERIAL')
    tini = time.perf_counter()
    out1 = [None]*nparl
    for i in range(nparl):
        out1[i] = monte_carlo_pi(n_in)
    print("pi = ", sum(out1)/nparl)
    tend = time.perf_counter()
    print("Time elapsed: ", tend-tini)
    print('***************************************')
    print('2 - MULTI THREAD WITH CONCURRENT.FUTURES')
    tini = time.perf_counter()
    thread = [None]*nparl
    out2 = [None]*nparl
    with concurrent.futures.ThreadPoolExecutor() as executor:
        for i in range(nparl):
            thread[i] = executor.submit(monte_carlo_pi, n_in)
            out2[i]=thread[i].result()
    print("pi = ", sum(out2)/nparl)
    tend = time.perf_counter()
    print("Time elapsed: ", tend-tini)
    print('***************************************')
    print('3 - MULTI PROCESSES WITH CONCURRENT.FUTURES')
    tini = time.perf_counter()
    process = [None]*nparl
    out3 = [None]*nparl
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for i in range(nparl):
            process[i] = executor.submit(monte_carlo_pi, n_in)
            out3[i]=process[i].result()
    print("pi = ", sum(out3)/nparl)
    tend = time.perf_counter()
    print("Time elapsed: ", tend-tini)
    print('***************************************')
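I took the function monte_carlo_pi() from the Numba site; it computes an approximation of π by random sampling. I am using Spyder (Python 3.8) on a 6-core Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz, running Windows 10.
I am a bit confused by the output: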
Number of parallel processes/threads: 6
********************************
1 - SERIAL
pi = 3.14171
Time elapsed: 0.22061820000089938
***************************************
2 - MULTI THREAD WITH CONCURRENT.FUTURES
pi = 3.1416696666666666
Time elapsed: 0.22931740000058198
***************************************
3 - MULTI PROCESSES WITH CONCURRENT.FUTURES
pi = 3.141372833333333
Time elapsed: 3.4144941000013205
***************************************
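I wonder if someone could help me answer a few questions.
I assumed that, with nogil=True, each thread would use a different core.
If I set nparl = 3 (number of parallel processes/threads) instead of nparl = 6, the elapsed time in cases 1 and 2 scales proportionally, while in case 3 it scales less than proportionally: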
Number of parallel processes/threads: 3
********************************
1 - SERIAL
pi = 3.1418706666666663
Time elapsed: 0.1084104000001389
***************************************
2 - MULTI THREAD WITH CONCURRENT.FUTURES
pi = 3.142234666666667
Time elapsed: 0.10978269999941404
***************************************
3 - MULTI PROCESSES WITH CONCURRENT.FUTURES
pi = 3.1407256666666665
Time elapsed: 2.4338965000006283
***************************************
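I am sure I am missing something very basic here, and I would appreciate it if someone could point me in the right direction.
Answer 0 (score: 0)
You are not running the jobs concurrently; the loop runs them one after another. Each iteration first schedules a job and then immediately waits for its result before scheduling the next one: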
$ diff -u test.py.orig test.py
--- test.py.orig 2021-04-04 16:09:30.986336091 +0100
+++ test.py 2021-04-04 16:08:56.097523834 +0100
@@ -35,7 +35,8 @@
     with concurrent.futures.ThreadPoolExecutor() as executor:
         for i in range(nparl):
             thread[i] = executor.submit(monte_carlo_pi, n_in)
-            out2[i]=thread[i].result()
+            # out2[i]=thread[i].result()
+        out2 = [i.result() for i in thread]
     print("pi = ", sum(out2)/nparl)
     tend = time.perf_counter()
     print("Time elapsed: ", tend-tini)
@@ -47,7 +48,8 @@
     with concurrent.futures.ProcessPoolExecutor() as executor:
         for i in range(nparl):
             process[i] = executor.submit(monte_carlo_pi, n_in)
-            out3[i]=process[i].result()
+            # out3[i]=process[i].result()
+        out3 = [i.result() for i in process]
     print("pi = ", sum(out3)/nparl)
     tend = time.perf_counter()
     print("Time elapsed: ", tend-tini)
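To see the effect in isolation, here is a minimal sketch that times both patterns. It does not use Numba: `slow_task` is a hypothetical stand-in for monte_carlo_pi that simply sleeps (time.sleep releases the GIL, so threads can overlap), and the helper names `submit_and_wait`, `submit_then_collect` and `timed` are made up for this illustration.

```python
import concurrent.futures
import time


def slow_task(delay):
    """Hypothetical stand-in for monte_carlo_pi: sleeps, then returns its input."""
    time.sleep(delay)
    return delay


def submit_and_wait(n_tasks, delay):
    """Pattern from the question: .result() inside the submit loop."""
    results = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        for _ in range(n_tasks):
            future = executor.submit(slow_task, delay)
            # Blocks until this task finishes, so the next one is not
            # submitted until then: the tasks run one at a time.
            results.append(future.result())
    return results


def submit_then_collect(n_tasks, delay):
    """Pattern from the diff: submit every task first, then gather results."""
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = []
        for _ in range(n_tasks):
            futures.append(executor.submit(slow_task, delay))
        # All tasks are already queued, so they can overlap while we wait.
        results = [f.result() for f in futures]
    return results


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call to func."""
    start = time.perf_counter()
    out = func(*args)
    return out, time.perf_counter() - start


if __name__ == "__main__":
    _, t_blocking = timed(submit_and_wait, 4, 0.2)
    _, t_concurrent = timed(submit_then_collect, 4, 0.2)
    print("submit-and-wait:     ", t_blocking)
    print("submit-then-collect: ", t_concurrent)
```

With four tasks of 0.2 s each, the first variant should take roughly the sum of the delays, while the second should take roughly one delay, provided the pool has at least four worker threads.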
By the way, there is no need to pre-initialize list variables in Python, so something like this should do the trick:
with concurrent.futures.ThreadPoolExecutor() as executor:
    threads = [executor.submit(monte_carlo_pi, n_in) for _ in range(nparl)]
    out2 = [t.result() for t in threads]
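Another option, if you only need the results in submission order, is Executor.map, which schedules all the calls up front and yields the results in input order. This is just a sketch: `square` is a hypothetical stand-in for monte_carlo_pi and `run_with_map` is a made-up helper name.

```python
import concurrent.futures


def square(x):
    """Hypothetical stand-in for monte_carlo_pi."""
    return x * x


def run_with_map(values):
    """Run square over all values in a thread pool, keeping input order."""
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # map() submits one call per input immediately; iterating the
        # returned iterator waits for each result in input order.
        return list(executor.map(square, values))


if __name__ == "__main__":
    print(run_with_map([1, 2, 3]))
```

Applied to the code in the question, this would presumably look like out2 = list(executor.map(monte_carlo_pi, [n_in] * nparl)).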