Parallel image processing in Python

Time: 2015-12-18 16:49:45

Tags: python image-processing parallel-processing

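I'm doing some image processing, but I have a lot of images (~10,000). So I want to run it in parallel, but for some reason it isn't as fast as it should be. I'm using a MacBook Pro with 16 GB of RAM and an i7. The code looks like this: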

import os
from multiprocessing import Pool

import cv2

def process_image(img_name):
    # Read one image, transform it, and write the result under the same name
    im = cv2.imread('image/' + img_name)
    tfs_im = some_function(im)  # uses opencv, skimage and math
    cv2.imwrite('new_img/' + img_name, tfs_im)

if __name__ == '__main__':
    ### Set Working Dir
    wd_path = os.path.dirname(os.path.realpath(__file__))
    os.chdir(wd_path + '/..')

    img_list = os.listdir('image')
    pool = Pool(processes=8)
    pool.map(process_image, img_list)  # process the img_list iterable with the pool
    pool.close()
    pool.join()

I also tried a more basic approach, splitting the list by hand and starting the processes myself:

import os
from multiprocessing import Process, Queue

import cv2

def process_image(img_names):
    # Each worker handles its own chunk of file names sequentially
    for img_name in img_names:
        im = cv2.imread('image/' + img_name)
        tfs_im = some_function(im)  # uses opencv, skimage and math
        cv2.imwrite('new_img/' + img_name, tfs_im)

if __name__ == '__main__':
    ### Set Working Dir
    wd_path = os.path.dirname(os.path.realpath(__file__))
    os.chdir(wd_path + '/..')

    q = Queue()  # created but never used below
    img_list = os.listdir('image')

    # split work into 8 processes
    processes = 8
    def splitlist(inlist, chunksize):
        return [inlist[x:x+chunksize] for x in range(0, len(inlist), chunksize)]
    list_splitted = splitlist(img_list, len(img_list) // processes + 1)

    workers = []
    for imgs in list_splitted:
        p = Process(target=process_image, args=(imgs,))
        p.daemon = True
        p.start()
        workers.append(p)

    # daemon processes are killed when the parent exits, so wait for them
    for p in workers:
        p.join()

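Neither of these reaches the expected speed. I know each process needs some setup time, so the code won't run 8 times faster, but so far it is only about 2 times faster than the single-process version.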
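To put the 2x figure in context, here is a rough sketch based on Amdahl's law. This is a simplifying model chosen for illustration, not something measured on this workload. It estimates what fraction of the run time would have to be parallelizable to give a 2x speedup on 8 workers. The function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup under Amdahl's law for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def parallel_fraction_from_speedup(speedup, workers):
    """Invert Amdahl's law: which parallel fraction yields this speedup?"""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


if __name__ == '__main__':
    # Observed in the question: roughly 2x with 8 processes
    p = parallel_fraction_from_speedup(2.0, 8)
    print('implied parallel fraction: %.3f' % p)
    print('speedup check: %.3f' % amdahl_speedup(p, 8))
```

Under this model, a 2x speedup on 8 workers corresponds to a parallel fraction of 4/7, which means a large share of the time would be spent in work that does not scale. The model ignores process startup, shared disk bandwidth, and the gap between physical and logical cores, so it only gives a rough bound.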

Maybe some of the work can't be parallelized, for example reading and writing images from/to the same folder in different processes?
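One way to check this hypothesis would be to time each stage (read, transform, write) separately in a single process on a sample of images. The sketch below is a minimal, generic helper for that. The name `time_stages` and the stand-in stages are assumptions made so the example runs without OpenCV. They are not part of the original code.

```python
import time
from collections import defaultdict


def time_stages(items, stages):
    """Run each item through the named stages in order, summing wall time per stage.

    `stages` is a list of (name, function) pairs; each function receives the
    previous stage's output (the first one receives the item itself).
    """
    totals = defaultdict(float)
    for item in items:
        value = item
        for name, func in stages:
            start = time.perf_counter()
            value = func(value)
            totals[name] += time.perf_counter() - start
    return dict(totals)


if __name__ == '__main__':
    # Stand-in stages so the sketch runs anywhere; swap in the real
    # read / some_function / write calls to measure the actual pipeline.
    stages = [
        ('read', lambda name: [ord(c) for c in name]),
        ('transform', lambda data: sorted(data)),
        ('write', lambda data: len(data)),
    ]
    for name, seconds in time_stages(['a.png', 'b.png'], stages).items():
        print('%-10s %.6f s' % (name, seconds))
```

With the real pipeline, the stages could pass an `(img_name, image)` tuple along so that the write stage knows the output name. If read and write dominate the totals, the disk is a plausible bottleneck. If the transform dominates, the limit is more likely CPU cores or library-level threading.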

Thanks for any hints or suggestions!

0 answers:

No answers