Question

This is a general question about better ways to preprocess large, sparse images for deep learning.

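By "large" I mean that the images are too big (e.g. 1024 * 1024 * channels) to be fed directly into a deep learning pipeline, so they do not easily fit in the available GPU memory.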

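By "sparse" I mean that the features to detect are not evenly distributed, so cutting the image into smaller pieces (e.g. 64 * 64) may defeat the purpose of locating them within the large image. For example, you cannot locate a farmhouse in farmland from 99 pictures of land and 1 picture of a house.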

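My current solution is to cut the originals into small tiles with PIL and to penalize false negatives (i.e. a farmhouse labeled as land, in the example above).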
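
For concreteness, here is a minimal sketch of that approach. It assumes square tiles and image dimensions that are multiples of the stride (otherwise the border is dropped), and a hypothetical false-negative weight of 10. The names tile_boxes and weighted_bce are illustrative only and are not part of PIL; each box uses the (left, upper, right, lower) order that PIL's Image.crop expects.

```python
import math


def tile_boxes(width, height, tile, stride=None):
    """Return (left, upper, right, lower) boxes that cover the image.

    Keeping the box next to each tile preserves where the tile came from,
    so a detection can be mapped back onto the large image.
    A stride smaller than the tile size produces overlapping tiles.
    """
    stride = stride or tile
    boxes = []
    for top in range(0, max(height - tile, 0) + 1, stride):
        for left in range(0, max(width - tile, 0) + 1, stride):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


def weighted_bce(y_true, p_pred, pos_weight=10.0, eps=1e-7):
    """Binary cross-entropy where a missed positive costs pos_weight times more."""
    p = min(max(p_pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(pos_weight * y_true * math.log(p)
             + (1 - y_true) * math.log(1.0 - p))


boxes = tile_boxes(1024, 1024, 64)                  # 16 x 16 non-overlapping tiles
overlapping = tile_boxes(1024, 1024, 64, stride=32)  # tiles overlap by half

# A farmhouse (label 1) predicted as land is punished harder than
# land (label 0) predicted as a farmhouse with the same confidence.
false_negative_loss = weighted_bce(1, 0.1)
false_positive_loss = weighted_bce(0, 0.9)
```

With a PIL image, each tile would be obtained as image.crop(box) for box in boxes.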

I would like to know whether there are better solutions and pipelines for handling this kind of image data.

Answer 1

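You can write a custom data generator to feed data to your model. You do not need to load all the images at once. During training you can load a single image, or N (the batch size) images, at a time. Create a text/CSV file containing all the image file names and annotation data.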
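
One way such a file could be written and read back is sketched below. The column names filename and label, the example paths, and the helper names are assumptions for illustration; an in-memory buffer stands in for a real file, which would be opened with open(path, newline="").

```python
import csv
import io


def write_manifest(rows, fileobj):
    """Write (filename, label) pairs as CSV with a header row."""
    writer = csv.writer(fileobj)
    writer.writerow(["filename", "label"])
    writer.writerows(rows)


def read_manifest(fileobj):
    """Read the CSV back into two aligned lists: paths and integer labels."""
    paths, labels = [], []
    for row in csv.DictReader(fileobj):
        paths.append(row["filename"])
        labels.append(int(row["label"]))
    return paths, labels


# Hypothetical tile paths; label 1 marks a tile that contains a farmhouse.
rows = [("tiles/farm_000.png", 0), ("tiles/farm_001.png", 1)]

buffer = io.StringIO()  # stands in for a file on disk
write_manifest(rows, buffer)
buffer.seek(0)
paths, labels = read_manifest(buffer)
```

Only the paths and labels stay in memory; the images themselves are opened later, one batch at a time.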

import numpy as np
from PIL import Image

def datagenerator(batch_size):
    # The load_* and preprocess_* names are placeholders: supply your own functions.
    df_x = load_your_image_file_paths()  # sequence of image file paths
    df_y = load_your_annotation_data()   # annotations aligned with df_x
    i = 0  # index of the next sample to load
    while True:  # yield batches forever; the training loop decides when to stop
        x = []
        y = []
        for j in range(batch_size):
            img = Image.open(df_x[i])
            anno = df_y[i]
            final_img = preprocess_as_you_want(img)
            final_anno = preprocess_your_annotation_if_needed(anno)
            x.append(final_img)
            y.append(final_anno)
            # df_x is treated as circular: the generator is called many times
            # over the epochs, so wrap around before i passes the last index.
            i = (i + 1) % len(df_x)
        yield np.array(x), np.array(y)

Your generator is now ready. Let's train your model.
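
As a rough sketch of how training consumes such an endless generator: each epoch draws a fixed number of batches, usually ceil(number of samples / batch size). Frameworks such as Keras accept a generator together with a steps_per_epoch argument for the same purpose. Everything below (toy_generator, train, train_step, the integer samples) is a hypothetical stand-in that uses only the standard library, not a real model.

```python
import itertools
import math


def toy_generator(samples, batch_size):
    """Stand-in for the image generator: yields (x, y) batches forever."""
    i = 0
    while True:
        batch = [samples[(i + k) % len(samples)] for k in range(batch_size)]
        i = (i + batch_size) % len(samples)  # wrap around like the real generator
        yield [s[0] for s in batch], [s[1] for s in batch]


def train(generator, steps_per_epoch, epochs, train_step):
    """Draw exactly steps_per_epoch batches per epoch and average the loss."""
    history = []
    for _ in range(epochs):
        losses = [train_step(x, y)
                  for x, y in itertools.islice(generator, steps_per_epoch)]
        history.append(sum(losses) / len(losses))
    return history


samples = [(n, n % 2) for n in range(10)]  # (input, label) pairs
batch_size = 4
steps_per_epoch = math.ceil(len(samples) / batch_size)

seen = []  # records the inputs of every batch, for inspection


def train_step(x, y):
    """Placeholder for one optimizer step; returns a dummy loss."""
    seen.append(list(x))
    return 0.0


history = train(toy_generator(samples, batch_size), steps_per_epoch,
                epochs=2, train_step=train_step)
```

Because the generator wraps around, the last batch of an epoch may contain samples from the start of the data again.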

You can do the same for the validation data.
