Question

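I am using the VGG19 model from keras applications. I expected the image to be scaled to [-1, 1], but instead preprocess_input seems to be doing something else.

To preprocess the input, I use the following lines to first load the image and then scale it: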

import numpy as np
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input

img = image.load_img("./img.jpg", target_size=(256, 256))
img = preprocess_input(np.array(img))

print(img)
>>> array([[[151.061  , 138.22101, 131.32   ],
    ... ]]]

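The output appears to be in the [0, 255] interval, but the original 255s are mapped to values around 151 (presumably mean-centered). What input does VGG actually require? From looking at the source code (for mode='tf'), I thought it should be in [-1, 1]. Is it flexible enough that I can use whatever scaling I want? (I am using VGG to extract mid-level features from the Conv4 block.)

Looking at the source code of preprocess_input, I see: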

...
    if mode == 'tf':
        x /= 127.5
        x -= 1.
        return x
...

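This suggests that for the tensorflow backend (which is the backend keras is using), the input should be scaled to [-1, 1].

What I need to do is write a function restore_original_image_from_array() that takes img and reconstructs the original image that was fed in. The problem is that I'm not sure how the scaling is done for VGG19.

In short, I would like to do this: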

img = image.load_img("./img.jpg", target_size=(256, 256))
scaled_img = preprocess_input(np.array(img))
restore_original_image_from_array(scaled_img) == np.array(img)
>>> True

Answer 1

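The "mode" of the preprocess_input function depends on the framework in which the pretrained network weights were trained. The VGG19 network in Keras uses weights from the original VGG19 model in caffe, so the mode argument of preprocess_input should be left at its default (mode='caffe'). See this question: Keras VGG16 preprocess_input modes

For your purposes, use the preprocess_input function from keras.applications.vgg19 and reverse-engineer it from there.

The original preprocessing code is at: https://github.com/keras-team/keras-applications/blob/master/keras_applications/imagenet_utils.py#L21

It involves 1) converting the image from RGB to BGR and 2) subtracting the dataset mean from the image.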
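As a rough illustration of those two steps, here is a minimal NumPy sketch. It is not the Keras implementation: caffe_style_preprocess and tf_style_preprocess are hypothetical helper names, the input is assumed to be channels_last RGB, and the means are the values quoted elsewhere on this page. It shows why a white pixel ends up near 151 rather than at 1.0.

```python
import numpy as np

# Per-channel means in BGR order (values quoted in the answers on this page).
BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb):
    """Hypothetical stand-in for the 'caffe' mode, channels_last input only."""
    x = np.asarray(rgb, dtype=np.float32)
    x = x[..., ::-1]          # 1) RGB -> BGR
    return x - BGR_MEAN       # 2) subtract the per-channel mean

def tf_style_preprocess(rgb):
    """Mirror of the mode == 'tf' branch quoted in the question."""
    return np.asarray(rgb, dtype=np.float32) / 127.5 - 1.0

white = np.full((1, 1, 3), 255, dtype=np.uint8)   # a single white RGB pixel
print(caffe_style_preprocess(white))  # close to [151.061, 138.221, 131.32]
print(tf_style_preprocess(white))     # [[[1. 1. 1.]]]
```

The first printed row lines up with the first row of the output shown in the question, which is consistent with the image's top-left pixel being white and the 'caffe' mode being applied.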

Here is the code to restore the original image:

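import numpy as np

def restore_original_image_from_array(x, data_format='channels_last'):
    # Default layout is channels_last, which matches the (256, 256, 3)
    # array in the question and Keras' usual default.
    # Per-channel means in BGR order that the preprocessing subtracted
    mean = [103.939, 116.779, 123.68]

    # Work on a float copy so the caller's array is not modified in place
    x = np.array(x, dtype='float32')

    # Undo the zero-centering by adding the mean pixel back
    if data_format == 'channels_first':
        if x.ndim == 3:
            x[0, :, :] += mean[0]
            x[1, :, :] += mean[1]
            x[2, :, :] += mean[2]
        else:
            x[:, 0, :, :] += mean[0]
            x[:, 1, :, :] += mean[1]
            x[:, 2, :, :] += mean[2]
    else:
        x[..., 0] += mean[0]
        x[..., 1] += mean[1]
        x[..., 2] += mean[2]

    if data_format == 'channels_first':
        # 'BGR'->'RGB'
        if x.ndim == 3:
            x = x[::-1, ...]
        else:
            x = x[:, ::-1, ...]
    else:
        # 'BGR'->'RGB'
        x = x[..., ::-1]

    return x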

Answer 2

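VGG networks are trained on images in which each channel is normalized by mean = [103.939, 116.779, 123.68] and the channels are in BGR order. In addition, since an optimized image may take values anywhere between -∞ and ∞, we must clip to keep the values within the 0-255 range.
Here is the code to "deprocess" a processed image, i.e. to invert the preprocessing: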

import numpy as np

def deprocess_img(processed_img):
  x = processed_img.copy()
  # drop the batch dimension if there is one
  if len(x.shape) == 4:
    x = np.squeeze(x, 0)
  if len(x.shape) != 3:
    raise ValueError("Input to deprocess image must be an image of "
                     "dimension [1, height, width, channel] or [height, width, channel]")

  # perform the inverse of the preprocessing step:
  # add the per-channel means back (channels are in BGR order here)
  x[:, :, 0] += 103.939
  x[:, :, 1] += 116.779
  x[:, :, 2] += 123.68
  # BGR -> RGB
  x = x[:, :, ::-1]

  # clip to the valid pixel range and convert to 8-bit
  x = np.clip(x, 0, 255).astype('uint8')
  return x
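To check the round trip the question asks for, the following self-contained sketch may help. It makes several assumptions: preprocess and restore are hypothetical names, the forward step is a NumPy approximation of the 'caffe' mode rather than a call into Keras, and the input is channels_last. Unlike the code above, it rounds before casting to uint8, because float32 arithmetic can leave a value like 254.99998 that a plain cast would truncate to 254. The comparison uses np.array_equal, since == on arrays returns an element-wise array rather than a single True.

```python
import numpy as np

# Per-channel means in BGR order (values quoted in the answers on this page).
BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def preprocess(rgb_uint8):
    """Assumed forward step: RGB -> BGR, then subtract the BGR means."""
    return rgb_uint8[..., ::-1].astype(np.float32) - BGR_MEAN

def restore(processed):
    """Inverse: add the means back, BGR -> RGB, round, clip, cast to uint8."""
    x = processed + BGR_MEAN          # new array, the input is left untouched
    x = x[..., ::-1]
    return np.clip(np.rint(x), 0, 255).astype(np.uint8)

# Random stand-in for a loaded image, so no file is needed.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

restored = restore(preprocess(img))
print(np.array_equal(restored, img))  # True

# Values far outside the valid range (e.g. from an optimized image)
# are clamped into 0-255 instead of wrapping around.
out_of_range = np.array([[[-300.0, 0.0, 500.0]]], dtype=np.float32)
clamped = restore(out_of_range)
```

If the restored array is only needed for display, the rounding step matters little. It matters when an exact match with the original uint8 image is required.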

Reversing VGG's image preprocessing in Keras to get back the original image
