Question

I created the following convolutional autoencoder in TensorFlow 2 (see below):

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras import layers

image_height=480
image_width=640

class Autoencoder(Model):
  def __init__(self):
    super(Autoencoder, self).__init__()
    self.encoder = tf.keras.Sequential([
        layers.InputLayer(input_shape=(image_height, image_width, 1), name="layer1"),
        layers.Conv2D(16, (3, 3), activation='relu', name="layer2"),
        layers.MaxPooling2D(pool_size=(2, 2), name="layer3"),
        layers.Conv2D(8, (3, 3), activation='relu', name="layer4"),
        layers.MaxPooling2D(pool_size=(2, 2), name="layer5"),
        layers.Conv2D(8, (3, 3), activation='relu', name="layer6"),
        layers.MaxPooling2D(pool_size=(2, 2), name="layer7"),
        layers.Conv2D(8, (3, 3), activation='relu', name="layer8"),
        layers.MaxPooling2D(pool_size=(2, 2), name="layer9"),
        layers.Conv2D(8, (3, 3), activation='relu', name="layer10"),
        layers.MaxPooling2D(pool_size=(2, 2), name="layer11"),
        layers.Conv2D(8, (3, 3), activation='relu', name="layer12")
    ])
    self.decoder = tf.keras.Sequential([
        layers.UpSampling2D(size=(2,2), name="layer13"),
        layers.Conv2D(8, (3, 3), activation='relu', name="layer14"),
        layers.UpSampling2D(size=(2,2), name="layer15"),
        layers.Conv2D(8, (3, 3), activation='relu', name="layer16"),
        layers.UpSampling2D(size=(2,2), name="layer17"),
        layers.Conv2D(8, (3, 3), activation='relu', name="layer18"),
        layers.UpSampling2D(size=(2,2), name="layer19"),
        layers.Conv2D(16, (3, 3), activation='relu', name="layer20"),
        layers.UpSampling2D(size=(2,2), name="layer21"),
        layers.Conv2D(1, (3, 3), activation='relu', name="layer22")
    ])
    self._model = Model()

  def call(self, x):
    encoded = self.encoder(x)
    decoded = self.decoder(encoded)
    return decoded

I have also split my image data into two separate datasets:

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    'path/to/imagedir',
    validation_split=0.2,
    label_mode=None,
    subset="training",
    seed=123,
    image_size=(image_height, image_width),
    color_mode="grayscale"
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    'path/to/imagedir',
    validation_split=0.2,
    label_mode=None,
    subset="validation",
    seed=123,
    image_size=(image_height, image_width),
    color_mode="grayscale"
)

After that I create the autoencoder and compile it:

autoencoder = Autoencoder()
autoencoder.compile(loss='binary_crossentropy')

Then I want to train it:

autoencoder.fit(train_ds, train_ds, epochs=10, validation_data=(val_ds,val_ds))

Unfortunately, I get the following error:

raise ValueError("`y` argument is not supported when using "
ValueError: `y` argument is not supported when using dataset as input.

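The problem is that fit() cannot take a dataset as the x argument when the y argument is also a dataset. I also cannot keep the images as a list of tensors, because my dataset is too large.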
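One common way around this restriction is to let the dataset itself yield (input, target) pairs, so that no y argument is needed. The sketch below assumes that approach; `to_autoencoder_pairs` is a hypothetical helper name, and the small in-memory `demo_ds` only stands in for the real image dataset. The division by 255 is also an assumption: image_dataset_from_directory yields pixel values in [0, 255], while binary_crossentropy expects targets in [0, 1].

```python
import tensorflow as tf


def to_autoencoder_pairs(ds):
    """Map a dataset of image batches to (input, target) pairs with target == input."""
    def scale_and_pair(images):
        # Assumption: pixels arrive in [0, 255]; rescale them to [0, 1].
        images = tf.cast(images, tf.float32) / 255.0
        return images, images
    return ds.map(scale_and_pair)


# Stand-in for train_ds: four 8x8 grayscale images, batched in pairs.
demo_ds = tf.data.Dataset.from_tensor_slices(tf.fill([4, 8, 8, 1], 255.0)).batch(2)
paired_ds = to_autoencoder_pairs(demo_ds)
```

With the datasets from the question this would be used as autoencoder.fit(to_autoencoder_pairs(train_ds), epochs=10, validation_data=to_autoencoder_pairs(val_ds)), with no separate y argument.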

Answer 1

Sorry for posting this as an answer rather than a comment; I don't have enough reputation.

Try writing your own custom train() and train_step() functions, as in the TensorFlow tutorial: https://www.tensorflow.org/tutorials/generative/cvae
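A minimal sketch of what such a pair of functions might look like for an autoencoder, where the input doubles as the target. This is not the tutorial's code: the names `train_step` and `train`, the Adam optimizer, and the mean squared error loss are all assumptions chosen for illustration.

```python
import tensorflow as tf


def train_step(model, optimizer, loss_fn, images):
    """Run one gradient update, using the images as their own target."""
    with tf.GradientTape() as tape:
        reconstruction = model(images, training=True)
        loss = loss_fn(images, reconstruction)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


def train(model, dataset, epochs, optimizer=None, loss_fn=None):
    """Iterate over the dataset for a number of epochs; return the mean loss per epoch."""
    if optimizer is None:
        optimizer = tf.keras.optimizers.Adam()  # assumed default
    if loss_fn is None:
        loss_fn = tf.keras.losses.MeanSquaredError()  # assumed default
    history = []
    for _ in range(epochs):
        epoch_loss = tf.keras.metrics.Mean()
        for images in dataset:
            epoch_loss.update_state(train_step(model, optimizer, loss_fn, images))
        history.append(float(epoch_loss.result()))
    return history
```

Because the loop reads one batch at a time from the dataset, the full image set never has to fit in memory.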

Answer 2

I have now added the following code to my model:

def train_step(self, data):
    # Unpack the data. Its structure depends on your model and
    # on what you pass to `fit()`.
    x = data

    with tf.GradientTape() as tape:
      y_pred = self(x, training=True)  # Forward pass
      # Compute the loss value
      # (the loss function is configured in `compile()`)
      loss = self.compiled_loss(x, y_pred, regularization_losses=self.losses)

    # Compute gradients
    trainable_vars = self.trainable_variables
    gradients = tape.gradient(loss, trainable_vars)
    # Update weights
    self.optimizer.apply_gradients(zip(gradients, trainable_vars))
    # Update metrics (includes the metric that tracks the loss)
    self.compiled_metrics.update_state(x, y_pred)
    # Return a dict mapping metric names to current value
    return {m.name: m.result() for m in self.metrics}

Now I use it in fit():

autoencoder.fit(train_ds, epochs=10)

I now get the following error:

ValueError: Dimensions must be equal, but are 480 and 290 for '{{node binary_crossentropy/mul}} = Mul[T=DT_FLOAT](IteratorGetNext, binary_crossentropy/Log)' with input shapes: [?,480,640,1], [?,290,450,1].
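The 290 and 450 in this message can be reproduced by hand. The sketch below assumes the Keras defaults used by the model: a 3x3 Conv2D with padding='valid' trims 2 pixels per dimension, MaxPooling2D(2) halves and rounds down, and UpSampling2D(2) doubles. `output_size` is a hypothetical helper written for this illustration, not a Keras function.

```python
def output_size(size, padding="valid", kernel=3, n_down=5, n_up=5):
    """Trace one spatial dimension through the encoder and decoder of the model."""
    # A 'valid' convolution loses kernel - 1 pixels; a 'same' convolution loses none.
    shrink = kernel - 1 if padding == "valid" else 0
    for _ in range(n_down):
        size = (size - shrink) // 2  # Conv2D followed by MaxPooling2D(2)
    size -= shrink                   # last encoder Conv2D (layer12)
    for _ in range(n_up):
        size = size * 2 - shrink     # UpSampling2D(2) followed by Conv2D
    return size


print(output_size(480), output_size(640))                  # 290 450
print(output_size(480, "same"), output_size(640, "same"))  # 480 640
```

So the model output is 290x450 while the loss compares it against the 480x640 input. With padding='same' the sizes match again, provided height and width are divisible by 32 (2 to the power of the 5 pooling steps), which holds for 480 and 640.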

How can I fix this?
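One possible fix, sketched under assumptions: the mismatch comes from the default padding='valid', which makes every 3x3 convolution trim 2 pixels, so the decoder never gets back to 480x640. Setting padding='same' on every Conv2D keeps the sizes aligned as long as height and width are divisible by 32. The function name `build_same_padding_autoencoder` is hypothetical, and swapping the final relu for a sigmoid is an additional assumption made because binary_crossentropy expects predictions in [0, 1].

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_same_padding_autoencoder(height=480, width=640):
    """Same layer layout as the question, but every Conv2D uses padding='same'."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(height, width, 1)))
    # Encoder: five Conv2D + MaxPooling2D stages, then one more Conv2D.
    for filters in [16, 8, 8, 8, 8]:
        model.add(layers.Conv2D(filters, 3, activation="relu", padding="same"))
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Conv2D(8, 3, activation="relu", padding="same"))
    # Decoder: five UpSampling2D + Conv2D stages.
    for filters in [8, 8, 8, 16]:
        model.add(layers.UpSampling2D(2))
        model.add(layers.Conv2D(filters, 3, activation="relu", padding="same"))
    model.add(layers.UpSampling2D(2))
    # Assumption: sigmoid output so predictions lie in [0, 1].
    model.add(layers.Conv2D(1, 3, activation="sigmoid", padding="same"))
    return model
```

Calling the model on a batch of shape (n, 480, 640, 1) should then return a tensor of the same shape, so the loss can compare input and output element by element. The input images would also need to be scaled to [0, 1] for binary_crossentropy to be meaningful.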

How do I train a convolutional autoencoder in TensorFlow 2.0?