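I have a function that has to loop over the individual pixels of an image and compute some geometry. It takes a very long time to run (about 5 hours on a 24-megapixel image), but it seems like it should be easy to run in parallel across multiple cores. However, I cannot for the life of me find a well-documented, well-explained example of doing something like this with the multiprocessing package. Here is the code I currently have running as a toy example: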
import numpy as np
import matplotlib.pyplot as plt
from scipy import misc
from skimage import color
import multiprocessing
from multiprocessing import Process
#Some dumb stand in function for this exercise
def dumb_func(image):
    ny, nx = image.shape
    temp = np.empty_like(image)
    for y in range(ny):
        for x in range(nx):
            temp[y, x] = np.square(image[y, x])
    return temp
#Convert image to greyscale
img = color.rgb2gray(misc.ascent())
#Resize the image
ns = 2048 #Pixel size
img = misc.imresize(img, size = (ns, ns))
#Split the image into equal chunks...not sure how this works for arrays that
#are weird shapes and aren't the same size in each dimension
divs = 4
init_split = np.array_split(img, divs, axis = 0)
side = init_split[0].shape[0]
chunked = np.empty((divs, divs, side, side))
cur = 0
for i in range(divs):
    split = np.array_split(init_split[i], divs, axis = 1)
    for j in range(divs):
        chunked[i, j, :, :] = split[j]
        cur +=1
#Pull core count and divide by two to be safe
cores = int(multiprocessing.cpu_count() / 2)
result = np.empty_like(chunked)
idxs = np.array(np.meshgrid(np.arange(0, divs, 1),
                            np.arange(0, divs, 1))).T.reshape(-1, 2)
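Basically, this code loads an image, converts it to greyscale, makes it bigger, and then chunks it. The chunked array has shape (i, j, ny, nx), where i and j are indices identifying which chunk of the image is being worked on, and ny, nx give the size of each chunk in pixels.
I'm also creating an array called idxs that stores all possible indices into the chunked array, for pulling out the image chunks.
What I would like to do is run a function (dumb_func in this example) over the chunks in parallel and store the results in a result array of the same shape. The way I imagine doing it is to loop over the idxs array and assign processes the chunks belonging to those indices, up to the number of cores, wait for those cores to finish, and then feed more processes to the cores until everything is done. I'm stuck because I can't figure out A) how to access the return value of the function, and B) how to handle the case where there might be, say, 16 chunks and 5 cores, leaving a final iteration that needs only one process.
How can I do this? I've spent the last 6 to 7 hours reading about multiprocessing Pool, Process, map, starmap, and so on, but I cannot for the life of me understand how to implement this.
Edit for Reedinationer:
Here is my updated code, which runs without errors. However, the new_data array is never updated. I filled it with 100s, and at the end of the routine new_data is exactly as it was initialized.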
import numpy as np
import matplotlib.pyplot as plt
from scipy import misc
from multiprocessing import Process, JoinableQueue
from time import time
#SOme dumb stand in function for this exercise
def dumb_func(q, new_data):
    while True:
        index, image = q.get()
        temp = image **2
        new_data[index[0], index[1], :, :] = temp
        q.task_done()
if __name__ == "__main__":
    start = time()
    q = JoinableQueue()
    img = misc.ascent()
    #Resize the image
    ns = 2048 #Pixel size
    img = misc.imresize(img, size = (ns, ns))
    #Split the image into equal chunks...not sure how this works for arrays that
    #are weird shapes and aren't the same size in each dimension
    divs = 4
    init_split = np.array_split(img, divs, axis = 0)
    side = init_split[0].shape[0]
    chunked = np.empty((divs, divs, side, side))
    cur = 0
    for i in range(divs):
        split = np.array_split(init_split[i], divs, axis = 1)
        for j in range(divs):
            chunked[i, j, :, :] = split[j]
            cur +=1
    new_data = np.full(chunked.shape, 100)
    idxs = np.array(np.meshgrid(np.arange(0, divs, 1),
                                np.arange(0, divs, 1))).T.reshape(-1, 2)
    for i in range(len(idxs)):
        q.put((idxs[i], chunked[idxs[i][0], idxs[i][1], :, :]))
    print ('starting workers')
    worker_count = len(idxs)
    processes = []
    for i in range(worker_count):
        p = Process(target=dumb_func, args=[q, new_data])
        p.daemon = True
        p.start()
    print('main thread waiting')
    q.join()
    end = time()
    print('{:.3f} seconds elapsed'.format(end - start))
Answer 0 (score: 2)
I'd do something like this, starting with the dependencies:
from multiprocessing import Pool
import numpy as np
from PIL import Image
# and some for testing
from random import random
from time import sleep
First I define a function that splits the image up into "blocks", as you were describing:
def chunkit(ys, xs, blocksize=64):
    for y in range(0, ys, blocksize):
        yt = (y, min(ys, y + blocksize))
        for x in range(0, xs, blocksize):
            xt = (x, min(xs, x + blocksize))
            yield yt, xt
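This is a lazy iterator, so it produces blocks on demand and can keep going for a while without building the full list up front.
Then I define my worker function: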
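As a quick check of how the blocks behave when the image size is not a multiple of blocksize (the case the question was unsure about), here is a small sketch that reuses chunkit on a made-up 5x7 size. The sizes are purely illustrative.

```python
def chunkit(ys, xs, blocksize=64):
    """Yield ((y0, y1), (x0, x1)) bounds that tile a ys-by-xs image."""
    for y in range(0, ys, blocksize):
        yt = (y, min(ys, y + blocksize))
        for x in range(0, xs, blocksize):
            xt = (x, min(xs, x + blocksize))
            yield yt, xt


# A 5x7 "image" with 4-pixel blocks: the blocks on the bottom and right
# edges are simply smaller, and no padding is needed.
blocks = list(chunkit(5, 7, blocksize=4))
for block in blocks:
    print(block)
# ((0, 4), (0, 4))
# ((0, 4), (4, 7))
# ((4, 5), (0, 4))
# ((4, 5), (4, 7))
```

Because the bounds are clipped with min(), the blocks always cover every pixel exactly once, whatever the image shape.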
def dumb_func(cc):
    (y0,y1), (x0,x1) = cc
    # convert to floats for ease of processing
    chunk = image[y0:y1,x0:x1] / 255.
    # random slow down for testing
    # sleep(random() ** 6)
    res = chunk ** 2
    # convert back to bytes for efficiency
    return cc, (res * 255).astype(np.uint8)
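I make sure the source array stays as close to its original format as possible for efficiency, and I send the results back in the same format (this might take a bit more fiddling if you're dealing with other pixel formats, obviously).
Then I put it all together: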
if __name__ == '__main__':
    source = Image.open('tmp.jpeg')
    image = np.asarray(source)
    print("loaded", image.shape, image.dtype)
    with Pool() as pool:
        resit = pool.imap_unordered(
            dumb_func, chunkit(*image.shape[:2]))
        output = np.empty_like(image)
        for cc, res in resit:
            (y0,y1), (x0,x1) = cc
            output[y0:y1,x0:x1] = res
    im = Image.fromarray(output, 'RGB')
    im.save('out.jpeg')
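This churns through a 15-megapixel image in a few seconds, most of which is spent loading and saving the image. It could probably be much smarter about array strides and cache friendliness, but hopefully this helps!
Note: I think this code relies on CPython's Unix-style process forking semantics to make sure the image is shared efficiently between processes. I'm not sure what would happen if you ran it elsewhere.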
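If relying on fork is a concern, one possible variation (an assumption on my part, not something tested against the original setup) is to hand the array to each worker explicitly through the initializer and initargs arguments of Pool. Under the spawn start method the array is then pickled once per worker rather than once per chunk. The names init_worker, square_chunk and run are hypothetical, and the random array stands in for tmp.jpeg.

```python
from multiprocessing import Pool

import numpy as np

_image = None  # filled in inside each worker process by init_worker


def init_worker(arr):
    """Runs once in every worker; stores the source array in a module global."""
    global _image
    _image = arr


def chunkit(ys, xs, blocksize=64):
    for y in range(0, ys, blocksize):
        yt = (y, min(ys, y + blocksize))
        for x in range(0, xs, blocksize):
            xt = (x, min(xs, x + blocksize))
            yield yt, xt


def square_chunk(cc):
    """Same arithmetic as dumb_func above, reading from the worker-local global."""
    (y0, y1), (x0, x1) = cc
    chunk = _image[y0:y1, x0:x1] / 255.
    res = chunk ** 2
    return cc, (res * 255).astype(np.uint8)


def run(image, blocksize=64, processes=None):
    output = np.empty_like(image)
    with Pool(processes, initializer=init_worker, initargs=(image,)) as pool:
        chunks = chunkit(*image.shape[:2], blocksize=blocksize)
        # results arrive in completion order; cc says where each one belongs
        for cc, res in pool.imap_unordered(square_chunk, chunks):
            (y0, y1), (x0, x1) = cc
            output[y0:y1, x0:x1] = res
    return output


if __name__ == '__main__':
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(200, 300, 3), dtype=np.uint8)
    output = run(image, processes=2)
    expected = ((image / 255.) ** 2 * 255).astype(np.uint8)
    print("matches serial result:", np.array_equal(output, expected))
```

The pool hands out chunks as workers become free, so a chunk count that is not a multiple of the worker count needs no special handling.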
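Answer 1 (score: 1)
I've basically been writing code for the same thing. The goal right now is to replace white pixels with transparent pixels, but it seems to replace the whole image, so there's a bug somewhere... It no longer raises errors in the multiprocessing module, though, so maybe it can serve as an example of how to load up a Queue and then have your worker processes work through it!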
from PIL import Image
from multiprocessing import Process, JoinableQueue
from threading import Thread
from time import time
def worker_function(q, new_data):
    while True:
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
        new_data[index] = out_pixel
        q.task_done()
if __name__ == "__main__":
    start = time()
    q = JoinableQueue()
    my_image = Image.open('InputImage.jpg')
    my_image = my_image.convert('RGBA')
    datas = list(my_image.getdata())
    new_data = [0] * len(datas) # make a blank array the size of our image to fill later
    print('putting image into queue')
    for count, item in enumerate(datas):
        q.put((count, item))
    print('starting workers')
    worker_count = 50
    processes = []
    for i in range(worker_count):
        p = Process(target=worker_function, args=[q, new_data])
        p.daemon = True
        p.start()
    print('main thread waiting')
    q.join()
    my_image.putdata(new_data)
    my_image.save('output.png', "PNG")
    end = time()
    print('{:.3f} seconds elapsed'.format(end - start))
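I think it is important to "protect" your code inside the if __name__ == "__main__" block, otherwise the spawned processes seem to run it as well.
It seems you need to use a Manager() so that the workers' writes to new_data become visible in the main process (or maybe there is another way that I don't know about!). I got my code to run by changing it to the following: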
from PIL import Image
from multiprocessing import Process, JoinableQueue, Manager
from threading import Thread
from time import time
def worker_function(q, new_data):
    while True:
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
        new_data[index] = out_pixel
        q.task_done()
if __name__ == "__main__":
    start = time()
    q = JoinableQueue()
    my_image = Image.open('InputImage.jpg')
    my_image = my_image.convert('RGBA')
    datas = list(my_image.getdata())
    # new_data = [(0, 0, 0, 0)]*len(datas)
    manager = Manager()
    new_data = manager.list([(0, 0, 0, 0)]*len(datas))
    print(new_data)
    print('putting image into queue')
    for count, item in enumerate(datas):
        q.put((count, item))
    print('starting workers')
    worker_count = 50
    processes = []
    for i in range(worker_count):
        p = Process(target=worker_function, args=[q, new_data])
        p.daemon = True
        p.start()
    print('main thread waiting')
    q.join()
    print("Saving Image")
    my_image.putdata(new_data)
    my_image.save('output.png', "PNG")
    end = time()
    print('{:.3f} seconds elapsed'.format(end - start))
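This doesn't seem to be the fastest option, though! I'm sure there are other ways to speed it up. My code that does the same thing with Thread looks very similar: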
from PIL import Image
from threading import Thread
from queue import Queue
import time
start = time.time()
q = Queue()
planeIm = Image.open('InputImage.jpg')
planeIm = planeIm.convert('RGBA')
datas = planeIm.getdata()
new_data = [0] * len(datas)
print('putting image into queue')
for count, item in enumerate(datas):
    q.put((count, item))
def worker_function():
    while True:
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
        new_data[index] = out_pixel
        q.task_done()
print('starting workers')
worker_count = 100
for i in range(worker_count):
    t = Thread(target=worker_function)
    t.daemon = True
    t.start()
print('main thread waiting')
q.join()
print('Queue has been joined')
planeIm.putdata(new_data)
planeIm.save('output.png', "PNG")
end = time.time()
elapsed = end - start
print('{:3.3} seconds elapsed'.format(elapsed))
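However, processing my image takes about 23 seconds with threads, whereas with multiprocessing it takes about 170 seconds! I suspect this is due to the larger overhead of starting Process objects, and because my per-pixel algorithm is currently very simple (just the if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240: bit), so I'm probably not seeing the speedup that a more complex pixel-processing algorithm would give me. Also note this passage from the multiprocessing documentation:
A single manager can be shared by processes on different computers over a network. They are, however, slower than using shared memory.
This leads me to believe that there are faster alternatives.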
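One such alternative might look like the following sketch, which uses multiprocessing.shared_memory (available from Python 3.8) together with NumPy instead of a Manager list. I have not benchmarked it against the code above, so treat it as an illustration of the idea only. The function names square_rows and square_in_shared_memory are hypothetical, and squaring stands in for the real per-pixel work. Each task carries only the block name and a row range, so no pixel data goes through a queue.

```python
from multiprocessing import Pool, shared_memory

import numpy as np


def square_rows(task):
    """Worker: attach to the shared block by name and square rows [y0, y1) in place."""
    name, shape, dtype, y0, y1 = task
    shm = shared_memory.SharedMemory(name=name)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    arr[y0:y1] **= 2
    del arr        # drop the NumPy view before closing the mapping
    shm.close()
    return y0, y1


def square_in_shared_memory(data, n_chunks=4, processes=2):
    """Copy data into shared memory, square it in parallel by row bands, return a copy."""
    shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
    try:
        shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
        shared[:] = data
        # row boundaries of the bands, e.g. [0, 2, 4] for 4 rows in 2 chunks
        edges = np.linspace(0, data.shape[0], n_chunks + 1).astype(int)
        tasks = [(shm.name, data.shape, data.dtype.str, int(a), int(b))
                 for a, b in zip(edges[:-1], edges[1:])]
        with Pool(processes) as pool:
            pool.map(square_rows, tasks)
        result = shared.copy()
        del shared  # release the view so the block can be closed
    finally:
        shm.close()
        shm.unlink()
    return result


if __name__ == "__main__":
    data = np.arange(12, dtype=np.float64).reshape(4, 3)
    print(square_in_shared_memory(data, n_chunks=2))
```

Working in row bands rather than single pixels also keeps the number of tasks small, which addresses the per-item overhead suspected above.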