How do I save a TensorFlow 2.2.0 model trained with a custom training loop?

Time: 2020-05-24 12:45:21

Tags: python tensorflow keras

I'm struggling to save a tf.keras model so that it can easily be loaded and used. I built an MLP with a custom loss function using the tf.keras.Model subclassing approach, as follows:

import tensorflow as tf
from tensorflow.keras import initializers
from tensorflow.keras.layers import Dense

class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.dense1 = Dense(400, activation='relu', kernel_initializer=initializers.glorot_uniform(), input_dim=5)
        self.dense2 = Dense(400, activation='relu', kernel_initializer=initializers.glorot_uniform())
        self.dense3 = Dense(400, activation='relu', kernel_initializer=initializers.glorot_uniform())
        self.dense4 = Dense(400, activation='relu', kernel_initializer=initializers.glorot_uniform())
        self.dense_out = Dense(1, activation='relu', kernel_initializer=initializers.glorot_uniform())

    @tf.function(input_signature=[tf.TensorSpec(shape=(None, 5), dtype=tf.float32, name='inputs')])   #CHECK tf.saved_model.save docs!
    def call(self, inputs, **kwargs):
        x = self.dense1(inputs)
        x = self.dense2(x)
        x = self.dense3(x)
        x = self.dense4(x)
        return self.dense_out(x)

    def get_loss(self, X, Y):
        with tf.GradientTape() as tape:
            tape.watch(tf.convert_to_tensor(X))
            Y_pred = self.call(X)
        return tf.reduce_mean(tf.math.square(Y_pred-Y)) + tf.reduce_mean(tf.maximum(0, tape.gradient(Y_pred, X)[:, 2]))

    def get_grad_and_loss(self, X, Y):
        with tf.GradientTape() as tape:
            tape.watch(tf.convert_to_tensor(X))
            L = self.get_loss(X, Y)
        g = tape.gradient(L, self.trainable_weights)
        return g, L

I then create an instance of the model and run a standard training loop:

from sklearn.model_selection import train_test_split

# x, y (feature/target arrays) and batch (the batch size) are defined earlier.
model = MyModel()
epochs = 5
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5, beta_1=0.9, beta_2=0.999)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.25)
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch)
val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))
val_dataset = val_dataset.shuffle(buffer_size=1024).batch(batch)
val_acc_metric = tf.keras.metrics.MeanAbsoluteError()


## TRAINING LOOP
losses = []
for epoch in range(epochs):
    print(f'############ START OF EPOCH {epoch + 1} ################')
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        grads, L = model.get_grad_and_loss(x_batch_train, y_batch_train)
        losses.append(float(L))
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

        if step % 100 == 0:
            print('Training loss (for one batch) at step %s: %s' % (step, float(L)))
            print(f'Seen so far: {(step+1)*batch} samples')

    # Run a validation loop at the end of each epoch.
    for val_step, (x_batch_val, y_batch_val) in enumerate(val_dataset):
        val_logits = model.call(x_batch_val)
        # Update val metrics
        val_acc_metric(y_batch_val, val_logits)
    val_acc = val_acc_metric.result()
    val_acc_metric.reset_states()
    print(f'Validation acc: {val_acc}')

I tried to follow the steps outlined here. I call the model on random input to trigger model.build internally, and then try to save the model with:

model.save('mymodel', signatures=model.call.get_concrete_function([tf.TensorSpec(shape=(None, 5), dtype=tf.float32, name='inputs')]))

I then get the following error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/Maximocravero/opt/miniconda3/envs/finance_research/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 959, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "/Users/Maximocravero/opt/miniconda3/envs/finance_research/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 871, in _get_concrete_function_garbage_collected
    return self._stateless_fn._get_concrete_function_garbage_collected(  # pylint: disable=protected-access
  File "/Users/Maximocravero/opt/miniconda3/envs/finance_research/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2480, in _get_concrete_function_garbage_collected
    raise ValueError("Structure of Python function inputs does not match "
ValueError: Structure of Python function inputs does not match input_signature.

I don't understand this, since I specified the same TensorSpec in the tf.function decorator above model.call. I also tried it without the tf.function decorator above call, which produced an error about the input dimensions having to be set. I can get around that by calling the model on arbitrary input, which does let me save the model, but the loaded model then has to be compiled before use, and I get the following warning:

WARNING:tensorflow:From /Users/Maximocravero/opt/miniconda3/envs/finance_research/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py:1813: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.

My question is whether I'm missing something entirely, or whether a loaded custom model generally has to be compiled. I'm running TensorFlow 2.2.0 with Python 3.8.2, and based on the documentation for saving models this should really be quite simple. I'm new to TensorFlow, so this may well be a silly mistake, but at the end of the day it's still a basic model with 5 inputs and a single output. Any help would be greatly appreciated.
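One thing I noticed while writing this up: the input_signature on call declares a single positional tensor argument, but in my save call I pass a *list containing* the TensorSpec to get_concrete_function, which adds one extra level of nesting. My unverified guess is that the structure check expects the spec passed directly, i.e. `model.call.get_concrete_function(tf.TensorSpec(shape=(None, 5), dtype=tf.float32, name='inputs'))`. A toy illustration of that kind of nesting mismatch in plain Python (no TensorFlow; all names are made up):

```python
def same_structure(a, b):
    """Crude stand-in for the nested-structure check TensorFlow performs:
    two values match if both are leaves, or both are sequences whose
    elements match pairwise."""
    a_seq = isinstance(a, (list, tuple))
    b_seq = isinstance(b, (list, tuple))
    if a_seq != b_seq:
        return False
    if not a_seq:
        return True
    return len(a) == len(b) and all(same_structure(x, y) for x, y in zip(a, b))

spec = "TensorSpec(shape=(None, 5))"  # stand-in leaf, not a real TensorSpec

declared = (spec,)          # input_signature=[spec]: ONE positional tensor arg
passed_wrapped = ([spec],)  # get_concrete_function([spec]): extra list level
passed_direct = (spec,)     # get_concrete_function(spec): same structure

assert not same_structure(declared, passed_wrapped)  # mismatch, like my error
assert same_structure(declared, passed_direct)
```

If that guess is right, dropping the wrapping list around the TensorSpec in my save call would make the structures line up, though I haven't confirmed it against the real code.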

0 Answers:

There are no answers yet.