Question

I'm trying to detect road signs in images using a deep neural network (based on the tflearn example):

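import tflearn
from tflearn.data_utils import image_preloader, shuffle
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression

dataset_file = [path_to_dataset_file]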
X, Y = image_preloader(dataset_file, image_shape=(32, 32), mode='file',
                       categorical_labels=True, normalize=True)
X, Y = shuffle(X, Y)

network = input_data(shape=[None, 32, 32, 3])
network = conv_2d(network, 32, 3, activation='relu')
network = max_pool_2d(network, 2)
network = conv_2d(network, 64, 3, activation='relu')
network = conv_2d(network, 64, 3, activation='relu')
network = fully_connected(network, 512, activation='relu')
network = dropout(network, 0.5)
network = fully_connected(network, 2, activation='softmax')
network = regression(network)

model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit(X, Y, n_epoch=1000, show_metric=True)

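It works very well for small images (32 x 32 px), but I would like to improve my network so it can handle bigger images (500 x 500 px or larger if possible), where the road sign may be in the background, in a corner, and so on. Trying to run this code with shape=[None, 500, 500, 3] made my computer freeze :)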

I was thinking about an approach like this (pseudocode):

SIZE_GOOD_ENOUGH = 32

def try_detect(image):
    if image_too_small(image):  # image is too small when width
        return False            # or height < SIZE_GOOD_ENOUGH
    resized_image = image.resize_to(SIZE_GOOD_ENOUGH, SIZE_GOOD_ENOUGH)
    result = detect_with_DNN(resized_image)  # returns True if detected
    if result:
        return True
    smaller_images_list = cut_into_pieces(image)  # list of smaller images
    for smaller_image in smaller_images_list:
        result = try_detect(smaller_image)  # recursion
        if result:
            return True
    return False

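...or something similar, but I would still like a bigger SIZE_GOOD_ENOUGH, because some of the resized road signs are hard to recognize even for me. Is there any way to improve my network so that it works better with (for example) 200 x 200 px images? For me, "better" means "don't kill my GPU" while still reaching an accuracy > 0.9. Maybe my conv_2d / max_pool_2d layers are not well chosen? I would appreciate any suggestions.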

Answer 1

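To reduce GPU memory usage, you can shrink the spatial size of the feature maps more aggressively at the beginning of the network. To train a bigger network, you need a GPU with 4 GB of memory or more, or several GPUs.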
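To see why early downsampling matters, here is a rough back-of-the-envelope sketch that counts the weights in the first fully connected layer of the network from the question. It assumes the convolutions keep the spatial size (stride 1 with 'same' padding) and that each 2 x 2 pooling halves it, rounding up. The helper names `fc_input_size` and `fc_weights` are made up for this illustration, and the count ignores biases, activations, and optimizer state.

```python
import math


def fc_input_size(side, channels, n_pools):
    """Flattened feature-map size after n_pools 2x2 poolings.

    Assumes 'same' padding, so each pooling maps side -> ceil(side / 2)
    and the convolutions leave the spatial size unchanged.
    """
    for _ in range(n_pools):
        side = math.ceil(side / 2)
    return side * side * channels


def fc_weights(side, channels=64, n_pools=1, fc_units=512):
    """Weight count of the first fully connected layer (biases ignored)."""
    return fc_input_size(side, channels, n_pools) * fc_units


if __name__ == "__main__":
    for side in (32, 200, 500):
        for n_pools in (1, 4):
            w = fc_weights(side, n_pools=n_pools)
            mib = w * 4 / 2**20  # float32 takes 4 bytes per weight
            print(f"{side}x{side}, {n_pools} pooling(s): "
                  f"{w:,} weights, about {mib:,.0f} MiB")
```

Under these assumptions, a 500 x 500 input with a single pooling layer puts roughly 2 billion weights into that one layer, which would already exceed the memory of a 4 GB GPU. Adding more pooling (or strided convolutions) before the fully connected layer brings this down by orders of magnitude.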

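Another point: I assume that in the 32x32 examples the road sign is centered in the image, whereas the 500x500 images are whole road scenes rather than just the sign. In that case, you would be better off doing something like object detection.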
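One very simple way to move towards detection, not necessarily what is meant above, is a sliding window: keep the 32 x 32 classifier and run it on overlapping crops of the large image. The sketch below only generates the crop boxes. The function name `sliding_windows` and the `window` and `stride` values are assumptions chosen for illustration, and dedicated object detection models are usually a better fit than this brute-force scan.

```python
def sliding_windows(width, height, window=32, stride=16):
    """Yield (left, top, right, bottom) crop boxes over an image.

    Boxes never extend past the image, so a thin strip at the right and
    bottom edges is skipped when (size - window) is not a multiple of stride.
    """
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield (left, top, left + window, top + window)


if __name__ == "__main__":
    boxes = list(sliding_windows(500, 500))
    print(len(boxes), "crops; first:", boxes[0], "last:", boxes[-1])
```

Each box could then be cropped from the image and passed to the existing model in one batch. To find signs that are larger than 32 x 32 px, the same scan can be repeated on downscaled copies of the image.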

How can I improve a deep neural network to work with bigger input images?
