Question

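When I run the code below, it raises ValueError: 'images' contains no shape. So I have to add the line after the # to set a static shape, but img_raw may have different shapes, and that line defeats the purpose of tf.image.resize_images. I just want to convert images of different shapes to [227,227,3]. How can I do that?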

def tf_read(file_queue):
    reader = tf.WholeFileReader()
    file_name, content = reader.read(file_queue)
    img_raw = tf.image.decode_image(content,3)
    # img_raw.set_shape([227,227,3])
    img_resized = tf.image.resize_images(img_raw,[227,227])
    img_shape = tf.shape(img_resized)
    return file_name, img_resized,img_shape

Answer 1

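The problem here actually comes from the fact that tf.image.decode_image does not return a tensor with a known static shape. This is explained in the following two GitHub issues: issue1, issue2.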

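The cause is that tf.image.decode_image also handles .gif files, for which it returns a 4-D tensor, whereas .jpg and .png files give a 3-D tensor. Since the rank is not known until the file is decoded, no static shape can be attached to the result.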

The solution is simply to use tf.image.decode_jpeg or tf.image.decode_png (both behave the same and can be used on .png and .jpg images):

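import tensorflow as tf

def _decode_image(filename):
    # Read the raw bytes and decode them into a 3-D tensor of shape (?, ?, 3).
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    image = tf.cast(image_decoded, tf.float32)
    # Height and width are still dynamic here; resize_images accepts that.
    image_resized = tf.image.resize_images(image, [224, 224])

    return image_resized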
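
Since the rank of the decoded tensor depends on the file format, it can help to know which format a file really is before picking a decoder. The following is a minimal sketch in plain Python (no TensorFlow), not part of the original answer: the helper names sniff_image_format and decoded_rank are hypothetical, and the check relies only on the standard leading magic bytes of JPEG, PNG and GIF files.

```python
def sniff_image_format(data):
    """Guess the image format from the leading magic bytes; None if unknown."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    return None


def decoded_rank(fmt):
    """Rank of the decoded tensor as described in the answer above:
    GIF decodes to 4-D (frames, height, width, channels), JPEG/PNG to 3-D."""
    return {"jpeg": 3, "png": 3, "gif": 4}.get(fmt)
```

For example, reading the first few bytes of a file with open(path, "rb").read(8) and passing them to sniff_image_format would tell you whether the file is a GIF that needs separate handling.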

Answer 2

No, tf.image.resize_images can handle dynamic shapes:

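import tensorflow as tf

file_queue = tf.train.string_input_producer(['./dog1.jpg'])
# shape of dog1.jpg is (720, 720)

reader = tf.WholeFileReader()
file_name, content = reader.read(file_queue)
img_raw = tf.image.decode_jpeg(content, 3)  # static shape (?, ?, 3)  <= dynamic h and w
# img_raw.set_shape([227,227,3])  # not needed
img_resized = tf.image.resize_images(img_raw, [227, 227])
img_shape = tf.shape(img_resized)

with tf.Session() as sess:
    # The queue-based reader blocks until the queue runners are started.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print(img_shape.eval())  # the resized shape: 227, 227, 3
    coord.request_stop()
    coord.join(threads)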

BTW, I use tf v0.12, which has no function named tf.image.decode_image, but I don't think that matters.

Answer 3

Of course, you can use a tensor object as the size input of tf.image.resize_images.

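So, by "convert images of different shapes to [227,227,3]", I assume you don't want to lose their aspect ratio, right? To achieve that, you have to rescale the input image first and then pad the rest with zeros.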

Note, though, that you should consider performing image distortion and standardization before padding.

import tensorflow as tf

def resize_and_pad(image, target_height=227, target_width=227):
    # Rescale so that one side of the image fits one side of the box,
    # then pad the rest with zeros.
    shape = tf.shape(image)
    img_h = tf.cast(shape[0], tf.float32)
    img_w = tf.cast(shape[1], tf.float32)
    box_h = tf.constant(target_height, tf.float32)
    box_w = tf.constant(target_width, tf.float32)
    img_ratio = tf.divide(img_h, img_w)
    aim_ratio = tf.divide(box_h, box_w)
    aim_h, aim_w = tf.cond(tf.greater(img_ratio, aim_ratio),
                           # Image is relatively taller than the box: fit the height.
                           lambda: (box_h, box_h / img_h * img_w),
                           # Otherwise fit the width.
                           lambda: (box_w / img_w * img_h, box_w))
    image_resize = tf.image.resize_images(image, tf.cast([aim_h, aim_w], tf.int32), align_corners=True)

    # Perform image standardization and distortion here.
    # Placeholder: this step is left unspecified, so the image is passed through unchanged.
    image_standardized_distorted = image_resize

    image_padded = tf.image.resize_image_with_crop_or_pad(image_standardized_distorted, target_height, target_width)
    return image_padded
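
To make the rescale-then-pad arithmetic easier to check by hand, here is a minimal sketch in plain Python (no TensorFlow). It is not the answer's code: the function names fit_inside_box and padding_for are hypothetical, integer floor division is my own choice for rounding, and splitting the padding evenly on both sides is an assumption about how you want the image placed in the box.

```python
def fit_inside_box(img_h, img_w, box_h=227, box_w=227):
    """Return (new_h, new_w) that fits inside the box and keeps the aspect ratio."""
    if img_h <= 0 or img_w <= 0:
        raise ValueError("image dimensions must be positive")
    # Cross-multiply instead of dividing so the comparison stays in integers.
    if img_h * box_w > img_w * box_h:
        # Image is relatively taller than the box: the height is the limiting side.
        new_h = box_h
        new_w = max(1, img_w * box_h // img_h)
    else:
        # Otherwise the width is the limiting side.
        new_w = box_w
        new_h = max(1, img_h * box_w // img_w)
    return new_h, new_w


def padding_for(new_h, new_w, box_h=227, box_w=227):
    """Return (top, bottom, left, right) zero padding that centers the image."""
    pad_h = box_h - new_h
    pad_w = box_w - new_w
    top = pad_h // 2
    left = pad_w // 2
    return top, pad_h - top, left, pad_w - left
```

For a 1000 x 500 image (height x width), fit_inside_box gives (227, 113), and padding_for(227, 113) gives 57 columns of zeros on each side and no padding above or below.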

Does the input of `tf.image.resize_images` have to have a static shape?
