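Training a custom CNN model in an eager execution environment

Time: 2019-02-23 00:44:56

Tags: tensorflow keras eager-execution

I built a CNN model using the "model subclassing" approach in Keras. Here is the class that represents my model: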

import tensorflow as tf

class ConvNet(tf.keras.Model):

    def __init__(self, data_format, classes):

        super(ConvNet, self).__init__()

        if data_format == "channels_first":
            axis = 1
        elif data_format == "channels_last":
            axis = -1

        self.conv_layer1 = tf.keras.layers.Conv2D(filters = 32, kernel_size = 3,strides = (1,1),
                                                  padding = "same",activation = "relu")
        self.pool_layer1 = tf.keras.layers.MaxPooling2D(pool_size = (2,2), strides = (2,2))
        self.conv_layer2 = tf.keras.layers.Conv2D(filters = 64, kernel_size = 3,strides = (1,1),
                                                  padding = "same",activation = "relu")
        self.pool_layer2 = tf.keras.layers.MaxPooling2D(pool_size = (2,2), strides = (2,2))
        self.conv_layer3 = tf.keras.layers.Conv2D(filters = 128, kernel_size = 5,strides = (1,1),
                                                  padding = "same",activation = "relu")
        self.pool_layer3 = tf.keras.layers.MaxPooling2D(pool_size = (2,2), strides = (1,1),
                                                       padding = "same")
        self.flatten = tf.keras.layers.Flatten()
        self.dense_layer1 = tf.keras.layers.Dense(units = 512, activation = "relu")
        self.dense_layer2 = tf.keras.layers.Dense(units = classes, activation = "softmax")

    def call(self, inputs, training = True):

        output_tensor = self.conv_layer1(inputs)
        output_tensor = self.pool_layer1(output_tensor)
        output_tensor = self.conv_layer2(output_tensor)
        output_tensor = self.pool_layer2(output_tensor)
        output_tensor = self.conv_layer3(output_tensor)
        output_tensor = self.pool_layer3(output_tensor)
        output_tensor = self.flatten(output_tensor)
        output_tensor = self.dense_layer1(output_tensor)

        return self.dense_layer2(output_tensor)

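I would like to know how to train it "eagerly", that is, without using the compile and fit methods.

I am not sure how exactly to build the training loop. I understand that I have to call tf.GradientTape.gradient() to compute the gradients and then use optimizers.apply_gradients() to update the model parameters.

What I don't understand is how to make predictions with the model to get the logits and then use them to compute the loss. I would be grateful if someone could help me build the training loop.

1 Answer:

Answer 0: (score: 1)

Eager execution is an imperative programming mode that lets developers follow Python's natural control flow. Essentially, you don't need to create placeholders and a computational graph first and then execute them in a TensorFlow session. You can use automatic differentiation to compute the gradients inside your training loop: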

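for i in range(iterations):
  with tf.GradientTape() as tape:
    # Forward pass; operations are recorded on the tape for differentiation.
    logits = model(batch_examples, training = True)
    loss = tf.losses.sparse_softmax_cross_entropy(batch_labels, logits)
  # Gradients of the loss with respect to the model's trainable variables.
  grads = tape.gradient(loss, model.trainable_variables)
  # apply_gradients expects an iterable of (gradient, variable) pairs.
  opt.apply_gradients(zip(grads, model.trainable_variables))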

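This assumes that model is an instance of the Keras Model class. I hope this solves your problem! You should also check out the TensorFlow Guide on eager execution.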
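For a version you can run end to end, here is a minimal sketch of the same loop. It assumes TensorFlow 2.x, where eager execution is on by default. The TinyConvNet class, the random stand-in data, the batch size, and the choice of Adam are all placeholders of mine for illustration, not something from the question, so swap in your own ConvNet and dataset.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 64 random 28x28 grayscale images, 10 classes.
rng = np.random.default_rng(0)
images = rng.random((64, 28, 28, 1), dtype=np.float32)
labels = rng.integers(0, 10, size=(64,)).astype(np.int64)


class TinyConvNet(tf.keras.Model):
    """Small illustrative model; the last layer returns raw logits."""

    def __init__(self, classes):
        super().__init__()
        self.conv = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu")
        self.pool = tf.keras.layers.MaxPooling2D(2)
        self.flatten = tf.keras.layers.Flatten()
        self.logits_layer = tf.keras.layers.Dense(classes)  # no softmax here

    def call(self, inputs, training=False):
        x = self.pool(self.conv(inputs))
        return self.logits_layer(self.flatten(x))


model = TinyConvNet(classes=10)
# from_logits=True tells the loss to apply softmax internally.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(16)


def train_step(batch_images, batch_labels):
    with tf.GradientTape() as tape:
        # Forward pass: calling the model directly returns the logits.
        logits = model(batch_images, training=True)
        loss = loss_fn(batch_labels, logits)
    # Backward pass: one gradient per trainable variable.
    grads = tape.gradient(loss, model.trainable_variables)
    # Pair each gradient with its variable and apply the update.
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


losses = []
for epoch in range(2):
    for batch_images, batch_labels in dataset:
        losses.append(float(train_step(batch_images, batch_labels)))
```

One thing to watch with the model in the question: its last Dense layer already applies softmax, so it returns probabilities rather than logits. A loss that expects logits would then apply softmax a second time. Either drop the activation from the last layer, as in the sketch, or keep it and use a loss configured for probabilities (for example from_logits=False).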