Question

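I am using a VGG16 model, which expects a 4D tensor as input. When I call model.fit(xtrain, ytrain, ...), my xtrain is a list of 3D tensors of shape [size, size, features], in this case [224, 224, 3].

What I want is a 4D tensor of shape [len(images), size, size, features].

How can I modify my code to get there?

I tried tf.expand_dims and tf.concat, but neither worked.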

# Transforming my image to a 3D Tensor
image = tf.io.read_file(image)
image = tf.image.decode_jpeg(image, channels=3)
image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
image = image / 255.0

Error message after model.fit:

Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (224, 224, 3)

Answer 1

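It looks like you are reading only a single image and passing it in. If so, you can add a dimension of size 1 as the first axis of the image. There are several ways to do this.

Using reshape (this is the NumPy array method; for a tf.Tensor, use tf.reshape(image, [1, 224, 224, 3]) instead):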

image = image.reshape(1, 224, 224, 3)

Using NumPy's slicing notation to add an axis (my personal favorite):

image = image[None, ...]

Using numpy.expand_dims(), as mentioned in Abhijit's answer.
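To check that the three options above agree, here is a minimal sketch. It assumes the image is already a NumPy array; the zero-filled array is only a stand-in for a decoded and resized picture, and the names batch_a, batch_b and batch_c are made up for illustration.

```python
import numpy as np

# Stand-in for one decoded, resized image: (height, width, channels).
image = np.zeros((224, 224, 3), dtype=np.float32)

# Three ways to add a leading batch axis of size 1.
batch_a = image.reshape(1, 224, 224, 3)
batch_b = image[None, ...]
batch_c = np.expand_dims(image, axis=0)

print(batch_a.shape, batch_b.shape, batch_c.shape)
```

All three results have shape (1, 224, 224, 3). If the image is still a tf.Tensor, tf.expand_dims(image, axis=0) is the TensorFlow counterpart.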

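I suspect, though, that you want to read a batch of images. Could the problem be in your input pipeline? Can you wrap the reading code in a loop and read multiple files? Something like this: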

images = []
for file in image_files:
    image = tf.io.read_file(file)
    # ...
    images.append(image)
images = np.asarray(images)
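As a rough illustration of how a list of equally sized 3D images becomes one 4D batch, the sketch below uses np.stack. The constant-filled arrays are hypothetical stand-ins for decoded images, not real data, and xtrain_alt is a name introduced only for the comparison.

```python
import numpy as np

# Hypothetical stand-ins for three decoded images of identical shape.
images = [np.full((224, 224, 3), i, dtype=np.float32) for i in range(3)]

# Stack along a new first axis: [len(images), size, size, features].
xtrain = np.stack(images, axis=0)
print(xtrain.shape)

# Same result: give each image a batch axis, then join along that axis.
xtrain_alt = np.concatenate([img[None, ...] for img in images], axis=0)
print(np.array_equal(xtrain, xtrain_alt))
```

This only works when every image has the same shape, which the resize step in the question should guarantee. For a list of tf.Tensor objects, tf.stack(images, axis=0) plays the same role; tf.concat alone joins along an existing axis, so it needs the per-image expand_dims step first.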

Answer 2

numpy.expand_dims(image, axis=0)

This converts the 3D tensor to a 4D one.
