Question

Before training with the ResNet50 model, I preprocess my input as follows:

img = image.load_img(os.path.join(TRAIN, img), target_size=[224, 224])
img = image.img_to_array(img)
img = np.expand_dims(img, axis=0)
img = preprocess_input(img)

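and save the images as a numpy array. I found that without preprocess_input the array is 1.5 GB, while with preprocess_input it is 7 GB. Is this normal behavior, or am I missing something? Why does zero-centering by mean pixel increase the input size so dramatically?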

This is how zero-centering by mean pixel is defined in Keras:

x = x[..., ::-1]
x[..., 0] -= 103.939
x[..., 1] -= 116.779
x[..., 2] -= 123.68

Answer 1

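This is because the pixel values used to be of type uint8 and are now of type float. A uint8 value takes 1 byte, while a float32 value takes 4 bytes, so the float array is about four times larger than the uint8 array. That is roughly consistent with the jump from 1.5 GB to 7 GB.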
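As a minimal sketch of this effect, the snippet below casts a small uint8 array to float32 and compares the memory used. The batch shape (10 images of 224x224x3) is an assumption chosen for illustration and is not taken from the question's dataset.

```python
import numpy as np

# Hypothetical batch: 10 RGB images of 224x224 pixels, one byte per channel value.
batch_uint8 = np.zeros((10, 224, 224, 3), dtype=np.uint8)

# Casting to float32 keeps the shape but uses 4 bytes per value.
batch_float32 = batch_uint8.astype(np.float32)

print(batch_uint8.dtype, batch_uint8.nbytes)      # uint8 1505280
print(batch_float32.dtype, batch_float32.nbytes)  # float32 6021120
print(batch_float32.nbytes / batch_uint8.nbytes)  # 4.0
```

The shape is unchanged by the cast; only the number of bytes per element grows.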

Answer 2

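Reading the Keras implementation, preprocess_input normalizes images by subtracting the dataset's mean pixel values, which appear to be constants obtained from ImageNet. Here is the code: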

def _preprocess_numpy_input(x, data_format, mode):
    if mode == 'tf':
        x /= 127.5
        x -= 1.
        return x

    if data_format == 'channels_first':
        if x.ndim == 3:
            # 'RGB'->'BGR'
            x = x[::-1, ...]
            # Zero-center by mean pixel
            x[0, :, :] -= 103.939
            x[1, :, :] -= 116.779
            x[2, :, :] -= 123.68
        else:
            x = x[:, ::-1, ...]
            x[:, 0, :, :] -= 103.939
            x[:, 1, :, :] -= 116.779
            x[:, 2, :, :] -= 123.68
    else:
        # 'RGB'->'BGR'
        x = x[..., ::-1]
        # Zero-center by mean pixel
        x[..., 0] -= 103.939
        x[..., 1] -= 116.779
        x[..., 2] -= 123.68
    return x

I don't know why using this code would increase the size of the dataset.
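One way to check where the growth comes from is to apply the channels-last branch to an array that is already float32 and compare sizes before and after. The sketch below uses a hypothetical helper, zero_center_bgr, that mirrors that branch with plain NumPy; it is not the Keras function itself, and the tiny 4x4 image is an assumption for illustration.

```python
import numpy as np

def zero_center_bgr(x):
    """Hypothetical helper mirroring the channels-last branch shown above."""
    x = x[..., ::-1].copy()  # 'RGB' -> 'BGR'; copy so the caller's array is untouched
    x[..., 0] -= 103.939     # subtract the per-channel mean constants
    x[..., 1] -= 116.779
    x[..., 2] -= 123.68
    return x

rgb_uint8 = np.full((1, 4, 4, 3), 128, dtype=np.uint8)  # tiny made-up image
rgb_float32 = rgb_uint8.astype(np.float32)              # the cast is where memory grows
centered = zero_center_bgr(rgb_float32)

print(rgb_uint8.nbytes, rgb_float32.nbytes, centered.nbytes)  # 48 192 192
```

In this sketch the subtraction leaves the byte count unchanged; the fourfold increase happens at the conversion from uint8 to float32.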

Answer 3

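According to the TensorFlow documentation, the argument is: a floating-point numpy.array or tf.Tensor, 3D or 4D, with 3 color channels and values in the range [0, 255]. The function returns: a preprocessed numpy.array or tf.Tensor of type float32.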

I think integers use a different amount of memory than floats.
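Since the question is about a saved array, the same comparison can be made on the bytes that np.save writes. This is a rough sketch: saved_size is a hypothetical helper, the array shape is made up, and an in-memory buffer stands in for a file on disk.

```python
import io
import numpy as np

def saved_size(arr):
    """Return how many bytes np.save writes for arr (hypothetical helper)."""
    buf = io.BytesIO()   # in-memory stand-in for a .npy file
    np.save(buf, arr)
    return buf.getbuffer().nbytes

images_uint8 = np.zeros((8, 224, 224, 3), dtype=np.uint8)
images_float32 = images_uint8.astype(np.float32)

size_uint8 = saved_size(images_uint8)
size_float32 = saved_size(images_float32)
print(size_uint8, size_float32, size_float32 / size_uint8)
```

The ratio comes out just under 4 because each saved array also carries a small .npy header in addition to the raw data.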

preprocess_input in Keras greatly increases the size of the training data
