Saving a custom TF2 model

Date: 2020-01-25 15:50:01

Tags: tensorflow keras tensorflow2.0 tf.keras tensorflow2.x

I wrote a custom model in TF2:

class NN(tf.keras.Model):

    def __init__(self,
                 output_dim: int,
                 controller_dime: int = 128,
                 interface_dim: int = 35,
                 netsize: int = 100,
                 degree: int = 20,
                 k: float = 2,
                 name: str = 'dnc_rn') -> None:

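It is full of non-trainable random parameters! So I need to save the model in full, and I can't use save_weights, because the training of each model depends on its own random parameters...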


The trainer file is as follows:

import numpy as np
import tensorflow as tf

def trainer(model: tf.keras.Model,
            loss_fn: tf.keras.losses,
            X_train: np.ndarray,
            y_train: np.ndarray = None,
            optimizer: tf.keras.optimizers = tf.keras.optimizers.Adam(learning_rate=1e-3),
            loss_fn_kwargs: dict = None,
            epochs: int = 1000000,
            batch_size: int = 1,
            buffer_size: int = 2048,
            shuffle: bool = False,
            verbose: bool = True,
            show_model_interface_vector: bool = False
            ) -> list:

    """
    Train TensorFlow model.

    Parameters
    ----------
    model
        Model to train.
    loss_fn
        Loss function used for training.
    X_train
        Training batch.
    y_train
        Training labels.
    optimizer
        Optimizer used for training.
    loss_fn_kwargs
        Kwargs for loss function.
    epochs
        Number of training epochs.
    batch_size
        Batch size used for training.
    buffer_size
        Maximum number of elements that will be buffered when prefetching.
    shuffle
        Whether to shuffle training data.
    verbose
        Whether to print training progress.
    """

    model.show_interface_vector = show_model_interface_vector

    # Create dataset
    if y_train is None:  # Unsupervised model
        train_data = X_train
    else:
        train_data = (X_train, y_train)
    train_data = tf.data.Dataset.from_tensor_slices(train_data)
    if shuffle:
        train_data = train_data.shuffle(buffer_size=buffer_size).batch(batch_size)

    # Iterate over epochs
    history = []
    for epoch in range(epochs):
        if verbose:
            pbar = tf.keras.utils.Progbar(target=epochs, width=40, verbose=1, interval=0.05)

        # Iterate over the batches of the dataset
        for step, train_batch in enumerate(train_data):

            if y_train is None:
                X_train_batch = train_batch
            else:
                X_train_batch, y_train_batch = train_batch

            with tf.GradientTape() as tape:
                preds = model(X_train_batch)

                if y_train is None:
                    ground_truth = X_train_batch
                else:
                    ground_truth = y_train_batch

                # Compute loss
                if tf.is_tensor(preds):
                    args = [ground_truth, preds]
                else:
                    args = [ground_truth] + list(preds)

                if loss_fn_kwargs:
                    loss = loss_fn(*args, **loss_fn_kwargs)
                else:
                    loss = loss_fn(*args)

                if model.losses:  # Additional model losses
                    loss += sum(model.losses)

            grads = tape.gradient(loss, model.trainable_weights)
            optimizer.apply_gradients(zip(grads, model.trainable_weights))

        if verbose:
            loss_val = loss.numpy().mean()
            pbar_values = [('loss', loss_val)]
            pbar.update(epoch + 1, values=pbar_values)

        # Record the loss of the last batch of this epoch
        history.append(loss.numpy().mean())

    model.show_interface_vector = not show_model_interface_vector
    return history

After training, I tried to save the model, but when I call TF2's .save:

model.save('a.h5')

I get an error:

NotImplementedError: Saving the model to HDF5 format requires the model to be a Functional model or a Sequential model. It does not work for subclassed models, because such models are defined via the body of a Python method, which isn't safely serializable. Consider saving to the Tensorflow SavedModel format (by setting save_format="tf") or using `save_weights`.

I changed it to the .tf format, but then got another error:

ValueError: Model <model2.NN object at 0x11448b390> cannot be saved because the input shapes have not been set. Usually, input shapes are automatically determined from calling .fit() or .predict(). To manually set the shapes, call model._set_inputs(inputs).

But the model has already been trained, and if I set the inputs manually I get:

ValueError: Cannot infer num from shape (None, 12, 4)

I don't know what to do. I'm a science and TF2 enthusiast. Please help, this is important for my project...

3 answers:

Answer 0: (score: 0)

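The first error occurs because model.save cannot write a subclassed model to the HDF5 format. A subclassed model is one where you define a class that inherits from tf.keras.models.Model. As the error message says, build the model with the Functional or Sequential API if you want to save it in h5 format.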
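As a rough illustration of that suggestion, here is a minimal sketch of a Functional-API model that is saved to HDF5 and loaded back. The layer sizes and the temporary file path are made up for the example, and the input shape (12, 4) is only borrowed from the error message in the question. It is not a port of the asker's NN class, whose internals are not shown.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical Functional-API model; the layers are placeholders, not the asker's NN.
inputs = tf.keras.Input(shape=(12, 4))
hidden = tf.keras.layers.Dense(8, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(2)(hidden)
functional_model = tf.keras.Model(inputs, outputs)

# Functional models have a serializable layer graph, so HDF5 saving is allowed.
h5_path = os.path.join(tempfile.mkdtemp(), "a.h5")
functional_model.save(h5_path)

# Reload and check that both models produce the same predictions.
reloaded_model = tf.keras.models.load_model(h5_path)
x = np.random.rand(3, 12, 4).astype("float32")
original_preds = functional_model(x).numpy()
reloaded_preds = reloaded_model(x).numpy()
```

Whether a model with custom, non-trainable random parameters can be expressed this way depends on how those parameters are created, so treat this only as a starting point.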

Answer 1: (score: 0)

I found the answer. I simply used:

import dill

# Serialize the whole model object to disk.
with open("model.pickle", "wb") as f:
    dill.dump(model, f)

# Load the model object back.
with open("model.pickle", "rb") as f:
    model_reloaded = dill.load(f)

From this thread: Saving an Object (Data persistence)

Answer 2: (score: 0)

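If you use a custom, complex model, meaning you create the optimizer, compute the gradients and apply them to some complex parts yourself, then tf.keras.Model.save is not a good fit, especially when the input shape is undefined (this is only my opinion).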

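The tf.train.Checkpoint API is a better match for this case; see the linked tutorial. tf.train.Checkpoint can save the model and the optimizer together. It is used in much the same way as tf.compat.v1.train.Saver, so it can also serve as a replacement when migrating code from TensorFlow 1 to TensorFlow 2.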
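To make this concrete, here is a minimal sketch of saving and restoring a subclassed model together with its optimizer through tf.train.Checkpoint. TinyModel, its layer sizes and the temporary checkpoint directory are hypothetical stand-ins for the question's NN class. The sketch assumes the model class is available again at load time, because a checkpoint stores variable values and not the Python code.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


class TinyModel(tf.keras.Model):
    """Hypothetical stand-in for the question's NN class."""

    def __init__(self, output_dim: int = 2, input_dim: int = 4):
        super().__init__()
        # Fixed random projection: initialized randomly and never trained.
        self.random_proj = self.add_weight(
            name="random_proj", shape=(input_dim, 8),
            initializer="random_normal", trainable=False)
        self.out = tf.keras.layers.Dense(output_dim)

    def call(self, x):
        return self.out(tf.matmul(x, self.random_proj))


x = np.random.rand(3, 4).astype("float32")

model = TinyModel()
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
expected = model(x).numpy()  # calling the model also creates its variables

# Save model and optimizer state together.
ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
save_path = ckpt.save(os.path.join(tempfile.mkdtemp(), "ckpt"))

# Rebuild a fresh model (with different random values) and restore into it.
restored_model = TinyModel()
restored_model(x)  # create the variables before restoring
restored_ckpt = tf.train.Checkpoint(
    model=restored_model,
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3))
restored_ckpt.restore(save_path)
restored = restored_model(x).numpy()
```

After the restore, the non-trainable random projection carries the saved values instead of the freshly sampled ones, which is the property the question needs.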