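To better understand Python's multiprocessing, I adapted the code found here into a producer/consumer model. The producer pushes integers onto a queue, and each consumer takes an integer off the queue, converts it to a byte string, and encrypts it with the specified key. What I don't understand is why running with three consumers is only slightly faster than running with one. One consumer took 63 seconds, while three consumers took 52 seconds. I expected it to scale linearly, so I figured three consumers would finish in about 20 seconds. Also, while monitoring CPU usage with top, I noticed that the producer uses as much CPU as the consumers, or more, which I don't understand because the producer does very little work compared to the consumers. Am I missing something that would make this an effective multiprocessing application?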
import time
import os
import random
import binascii
from multiprocessing import Process, Queue, Lock
from Crypto.Cipher import AES # pip3 install PyCryptodome
# Producer function that places data on the Queue
def producer(queue, lock):
    # Synchronize access to the console
    with lock:
        print('Starting producer => {}'.format(os.getpid()))

    # Put integers 0 to 999999 on the queue
    i = 0
    while i < 1000:
        for j in range(0, 1000):
            queue.put(i*1000 + j)
        i += 1

    # Synchronize access to the console
    with lock:
        print('Producer finished putting {} items in queue'.format(i*1000))

    # Synchronize access to the console
    with lock:
        print('Producer {} exiting...'.format(os.getpid()))
# The consumer function takes data off of the Queue
def consumer(queue, lock, key):
    # Synchronize access to the console
    with lock:
        print('Starting consumer => {}'.format(os.getpid()))

    rijn = AES.new(key, AES.MODE_ECB)

    # Run indefinitely
    while True:
        # If the queue is empty, queue.get() will block until the queue has data
        plaintext_int = queue.get()
        plaintext_bytes = plaintext_int.to_bytes(16, 'big')
        ciphertext = binascii.hexlify(rijn.encrypt(plaintext_bytes)).decode('utf-8')
if __name__ == '__main__':
    # Create the Queue object
    #queue = Queue(maxsize=10)
    queue = Queue()

    key = binascii.unhexlify('AAAABBBBCCCCDDDDEEEEFFFF00001111')

    # Create a lock object to synchronize resource access
    lock = Lock()

    producers = []
    consumers = []

    # Create our producer processes by passing the producer function and its arguments
    producers.append(Process(target=producer, args=(queue, lock)))

    # Create consumer processes
    n_consumers = 3
    for i in range(n_consumers):
        p = Process(target=consumer, args=(queue, lock, key))

        # This is critical! The consumer function has an infinite loop,
        # which means it will never exit unless we set daemon to True
        p.daemon = True
        consumers.append(p)

    # Start the producers and consumers
    # The Python VM will launch new independent processes for each Process object
    for p in producers:
        p.start()

    for c in consumers:
        c.start()

    # Like threading, we have a join() method that synchronizes our program
    for p in producers:
        p.join()

    print('Parent process exiting...')

    while not queue.empty():
        print("Waiting for queue to empty. Remaining items: {qsize}".format(qsize=queue.qsize()))
        time.sleep(1)
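One experiment that might help narrow this down is to put batches of integers on the queue instead of single integers, so that far fewer queue operations are needed for the same amount of work. This rests on the assumption, not measured anywhere in this post, that the per-item cost of `queue.put()` and `queue.get()` is significant. The helper below is only a sketch of the batching step, and `batched_range` is a made-up name.

```python
from itertools import islice

def batched_range(total, batch_size):
    """Yield lists of consecutive integers that together cover range(total)."""
    numbers = iter(range(total))
    while True:
        batch = list(islice(numbers, batch_size))
        if not batch:
            return
        yield batch

# Hypothetical use in the producer:  for batch in batched_range(1000000, 1000): queue.put(batch)
# Hypothetical use in the consumer:  for plaintext_int in queue.get(): ...
```

With a batch size of 1000 the producer would make 1000 calls to `queue.put()` instead of 1,000,000, while the consumers would still encrypt every integer.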
Edit
Here are a few benchmarks for the individual tasks.
Converting an int to a byte string:
python3 -m timeit -s 'plaintext_int = 326543' 'plaintext_int.to_bytes(16, "big")'
1000000 loops, best of 3: 0.504 usec per loop
Encrypting the byte string:
python3 -m timeit -s 'from Crypto.Cipher import AES; import binascii; key=binascii.unhexlify("AAAABBBBCCCCDDDDEEEEFFFF00001111"); rijn = AES.new(key, AES.MODE_ECB); plaintext_int = 326543; plaintext_bytes=plaintext_int.to_bytes(16, "big")' 'binascii.hexlify(rijn.encrypt(plaintext_bytes)).decode("utf-8")'
1000000 loops, best of 3: 1.76 usec per loop
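Combining the two benchmarks gives a rough estimate of the pure compute cost for the whole run. This is back-of-the-envelope arithmetic only: it assumes the timeit figures above are representative of the per-item cost inside the consumer, and `estimated_compute_seconds` is a name made up for illustration.

```python
# Per-item costs taken from the timeit runs above (microseconds).
TO_BYTES_USEC = 0.504
ENCRYPT_USEC = 1.76
N_ITEMS = 1000 * 1000  # the producer enqueues 1000 x 1000 integers

def estimated_compute_seconds(n_items, *per_item_usec):
    """Total time in seconds if only the benchmarked steps had to be paid for."""
    return n_items * sum(per_item_usec) / 1e6

compute_s = estimated_compute_seconds(N_ITEMS, TO_BYTES_USEC, ENCRYPT_USEC)
measured_one_consumer_s = 63     # wall-clock time reported in the question
measured_three_consumers_s = 52  # wall-clock time reported in the question

print("estimated compute: {:.2f} s".format(compute_s))
print("share of the 1-consumer run: {:.1%}".format(compute_s / measured_one_consumer_s))
print("observed speedup with 3 consumers: {:.2f}x".format(
    measured_one_consumer_s / measured_three_consumers_s))
```

Under these assumptions the conversion and encryption add up to roughly 2.3 seconds for all one million items, a small fraction of the 52 to 63 seconds measured, so most of the run time seems to be going somewhere other than the two benchmarked steps.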
Edit 2
After profiling, most of the time is spent waiting:
Sorted by time:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        7   44.590    6.370   44.590    6.370 {built-in method posix.waitpid}
        1    1.002    1.002    1.002    1.002 {built-in method time.sleep}
       37    0.006    0.000    0.006    0.000 {built-in method marshal.loads}
      129    0.002    0.000    0.003    0.000 {built-in method builtins.__build_class__}
        5    0.002    0.000    0.002    0.000 {built-in method _imp.create_dynamic}
     41/1    0.001    0.000   45.631   45.631 {built-in method builtins.exec}
      646    0.001    0.000    0.002    0.000 <frozen importlib._bootstrap_external>:50(_path_join)
      646    0.001    0.000    0.001    0.000 <frozen importlib._bootstrap_external>:52(<listcomp>)
      220    0.001    0.000    0.001    0.000 {built-in method posix.stat}
      125    0.001    0.000    0.005    0.000 <frozen importlib._bootstrap_external>:1215(find_spec)
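The table above has the shape of cProfile output sorted by tottime, and it appears to cover only the parent process, which spends its time in waitpid while the child processes do the work. A minimal sketch of profiling a consumer-style loop in-process instead, so that the per-item work shows up in the report. The workload is a stand-in that skips the AES step so it runs with the standard library alone, and `convert_items` and `profile_report` are hypothetical names.

```python
import binascii
import cProfile
import io
import pstats

def convert_items(n_items):
    """Stand-in for the consumer loop: int -> 16 bytes -> hex string (no AES)."""
    last = None
    for value in range(n_items):
        last = binascii.hexlify(value.to_bytes(16, 'big')).decode('utf-8')
    return last

def profile_report(n_items, top=10):
    """Profile convert_items and return the report text sorted by tottime."""
    profiler = cProfile.Profile()
    profiler.enable()
    convert_items(n_items)
    profiler.disable()
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats('tottime').print_stats(top)
    return stream.getvalue()

if __name__ == '__main__':
    print(profile_report(100000))
```

To see where a real consumer spends its time, the same pattern could be applied inside the `consumer` function of each child process, since a profiler running in the parent does not record what the children execute.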