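Filling a 3D numpy array from multiple image files in parallel

Date: 2016-07-17 15:55:40

Tags: python multithreading image numpy

I would like to load several grayscale images from files on disk concurrently and place them in one large numpy array, in order to speed up loading. The basic code looks like this: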

import numpy as np
import matplotlib.pyplot as plt

# prepare filenames
image_files = ...
mask_files = ...
n_samples = len(image_files)  # == len(mask_files)

# preallocate space
all_images = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)
all_masks = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)

# read images and masks
for sample, (img_path, mask_path) in enumerate(zip(image_files, mask_files)):
    all_images[sample, :, :] = plt.imread(img_path)
    all_masks[sample, :, :] = plt.imread(mask_path)

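I would like to run this loop in parallel, but I know that Python's true multithreading capabilities are limited by the GIL.

Do you have any ideas?

1 Answer:

Answer 0 (score: 1)

You could try using one thread for the images and one for the masks: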

import numpy as np
import matplotlib.pyplot as plt
from threading import Thread

# threading functions
def readImg(image_files, mask_files):
    # fill the preallocated all_images array; mask_files is unused here
    for sample, img_path in enumerate(image_files):
        all_images[sample, :, :] = plt.imread(img_path)

def readMask(image_files, mask_files):
    # fill the preallocated all_masks array; image_files is unused here
    for sample, mask_path in enumerate(mask_files):
        all_masks[sample, :, :] = plt.imread(mask_path)


# prepare filenames
image_files = ...
mask_files = ...
n_samples = len(image_files)  # == len(mask_files)

# preallocate space
all_images = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)
all_masks = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)

# threading stuff: one thread per array
image_thread = Thread(target=readImg,
                      args=[image_files, mask_files])
mask_thread = Thread(target=readMask,
                     args=[image_files, mask_files])

image_thread.daemon = True
mask_thread.daemon = True

image_thread.start()
mask_thread.start()

# wait for both threads to finish before using the arrays
image_thread.join()
mask_thread.join()

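Warning: do not copy this code verbatim. I have not tested it; it is only meant to convey the gist.

This will not use multiple cores, and the files will not be loaded in the same sequential order as in your code above. If you need that, you will have to implement it with a Queue. I don't think that is what you want, though, since you said you want concurrency and already know about the interpreter lock on Python threads.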
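A minimal sketch of what a Queue-based variant might look like, assuming every file decodes to a 2D array of the same shape. The function name `load_images_threaded` and its `loader` parameter are hypothetical and not part of the original answer; `loader` stands in for `plt.imread`. The threads still share one interpreter, so any speedup depends on how much of the loader's time is spent in I/O or in code that releases the GIL.

```python
import queue
import threading

import numpy as np


def load_images_threaded(paths, shape, loader, n_workers=4, dtype=np.float32):
    """Load paths[i] into out[i] using worker threads fed from a Queue.

    `loader` is any callable mapping a path to a 2D array of `shape`
    (hypothetical parameter; plt.imread would be one candidate).
    """
    out = np.empty((len(paths),) + tuple(shape), dtype=dtype)

    # Enqueue every (index, path) pair before any worker starts, so an
    # empty queue reliably means "no work left".
    tasks = queue.Queue()
    for index, path in enumerate(paths):
        tasks.put((index, path))

    errors = []

    def worker():
        while True:
            try:
                index, path = tasks.get_nowait()
            except queue.Empty:
                return
            try:
                # Each task writes to its own slice, so no lock is needed.
                out[index] = loader(path)
            except Exception as exc:
                errors.append((path, exc))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    if errors:
        path, exc = errors[0]
        raise RuntimeError(f"failed to load {path!r}") from exc
    return out
```

With the names from the question, this could be called as `all_images = load_images_threaded(image_files, (IMG_HEIGHT, IMG_WIDTH), plt.imread)`, and likewise for the masks. Because each index is tied to its path in the queue, the order of rows in the result matches the order of the file list regardless of which thread finishes first.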

Edit: based on your comment, see the post "Multiprocessing vs Threading Python" about using multiple cores. To change the example above accordingly, just use the line

from multiprocessing import Process as Thread

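They share a similar API. Note, however, that separate processes do not share memory with the parent, so writes to all_images and all_masks made inside a child process are not visible in the parent unless the arrays live in shared memory or the results are sent back.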
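One way to use several cores and still end up with a filled array in the parent process is to let worker processes decode the files and return the pixel data. The following is a hedged sketch under that assumption, not the answer author's method: `load_images_multiprocess`, `loader`, and `mp_context` are hypothetical names, and `loader` must be a picklable, importable function (for example `plt.imread`). Every decoded image is pickled back to the parent, so the gain depends on decoding being expensive relative to that transfer.

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np


def load_images_multiprocess(paths, shape, loader, n_workers=None,
                             mp_context=None, dtype=np.float32):
    """Decode files in worker processes and collect them in the parent.

    `loader` maps a path to a 2D array of `shape` and must be picklable
    (hypothetical parameter). `mp_context` is passed through to
    ProcessPoolExecutor; None selects the platform default start method.
    """
    out = np.empty((len(paths),) + tuple(shape), dtype=dtype)
    with ProcessPoolExecutor(max_workers=n_workers,
                             mp_context=mp_context) as pool:
        # map() yields results in input order, whichever worker finishes
        # first, so row i always corresponds to paths[i].
        for index, image in enumerate(pool.map(loader, paths)):
            out[index] = image
    return out
```

On platforms that start workers with the spawn method, the call should sit under an `if __name__ == "__main__":` guard, for example `all_images = load_images_multiprocess(image_files, (IMG_HEIGHT, IMG_WIDTH), plt.imread)`.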