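I want to save a model together with its optimizer state and load it later to continue training. I can save the model weights as an .h5 file, but I cannot get the optimizer state saved. Please help.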
Answer 0 (score: 0)
If you save the model as an 'h5' object with model.save(), it stores the optimizer state along with all the other information needed to restart training.
Code:
import numpy as np
import tensorflow as tf

# Random toy data: 1000 samples, 32 features, binary labels
x = np.random.uniform(0, 1, (1000, 32))
y = np.random.randint(0, 2, (1000,))

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(256, activation='relu'))
model.add(tf.keras.layers.Dense(2, activation='softmax'))
model.compile(loss="sparse_categorical_crossentropy",
              optimizer='adam',
              metrics=['accuracy'])

# Keep the initial lr for the first epoch, then decay it by exp(-0.1) every epoch
def scheduler(epoch, lr):
    if epoch < 1:
        return lr
    else:
        return lr * tf.math.exp(-0.1)

callback = tf.keras.callbacks.LearningRateScheduler(scheduler, verbose=1)
_ = model.fit(x=x, y=y, epochs=25, validation_split=0.2, verbose=1, callbacks=[callback])
Output:
Epoch 1/25
Epoch 00001: LearningRateScheduler reducing learning rate to 0.0010000000474974513.
25/25 [==============================] - 1s 24ms/step - loss: 0.6894 - accuracy: 0.5690 - val_loss: 0.6883 - val_accuracy: 0.5300
Epoch 2/25
Epoch 00002: LearningRateScheduler reducing learning rate to tf.Tensor(0.00090483745, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6853 - accuracy: 0.5378 - val_loss: 0.6843 - val_accuracy: 0.5650
Epoch 3/25
Epoch 00003: LearningRateScheduler reducing learning rate to tf.Tensor(0.0008187308, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6798 - accuracy: 0.5327 - val_loss: 0.6917 - val_accuracy: 0.5350
Epoch 4/25
Epoch 00004: LearningRateScheduler reducing learning rate to tf.Tensor(0.0007408183, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6733 - accuracy: 0.5744 - val_loss: 0.6848 - val_accuracy: 0.5550
Epoch 5/25
Epoch 00005: LearningRateScheduler reducing learning rate to tf.Tensor(0.0006703201, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6679 - accuracy: 0.6259 - val_loss: 0.6847 - val_accuracy: 0.5450
Epoch 6/25
Epoch 00006: LearningRateScheduler reducing learning rate to tf.Tensor(0.00060653075, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6613 - accuracy: 0.6176 - val_loss: 0.6890 - val_accuracy: 0.5450
Epoch 7/25
Epoch 00007: LearningRateScheduler reducing learning rate to tf.Tensor(0.00054881175, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6660 - accuracy: 0.6037 - val_loss: 0.6862 - val_accuracy: 0.5600
Epoch 8/25
Epoch 00008: LearningRateScheduler reducing learning rate to tf.Tensor(0.0004965854, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6635 - accuracy: 0.6162 - val_loss: 0.6886 - val_accuracy: 0.5600
Epoch 9/25
Epoch 00009: LearningRateScheduler reducing learning rate to tf.Tensor(0.00044932903, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6637 - accuracy: 0.5869 - val_loss: 0.6865 - val_accuracy: 0.5550
Epoch 10/25
Epoch 00010: LearningRateScheduler reducing learning rate to tf.Tensor(0.0004065697, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6583 - accuracy: 0.6218 - val_loss: 0.6883 - val_accuracy: 0.5700
Epoch 11/25
Epoch 00011: LearningRateScheduler reducing learning rate to tf.Tensor(0.0003678795, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6573 - accuracy: 0.5991 - val_loss: 0.6871 - val_accuracy: 0.5600
Epoch 12/25
Epoch 00012: LearningRateScheduler reducing learning rate to tf.Tensor(0.00033287113, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6497 - accuracy: 0.6228 - val_loss: 0.6876 - val_accuracy: 0.5650
Epoch 13/25
Epoch 00013: LearningRateScheduler reducing learning rate to tf.Tensor(0.00030119426, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6425 - accuracy: 0.6586 - val_loss: 0.6877 - val_accuracy: 0.5500
Epoch 14/25
Epoch 00014: LearningRateScheduler reducing learning rate to tf.Tensor(0.00027253185, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6424 - accuracy: 0.6579 - val_loss: 0.6878 - val_accuracy: 0.5650
Epoch 15/25
Epoch 00015: LearningRateScheduler reducing learning rate to tf.Tensor(0.00024659702, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6517 - accuracy: 0.6442 - val_loss: 0.6875 - val_accuracy: 0.5750
Epoch 16/25
Epoch 00016: LearningRateScheduler reducing learning rate to tf.Tensor(0.0002231302, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6401 - accuracy: 0.6753 - val_loss: 0.6886 - val_accuracy: 0.5650
Epoch 17/25
Epoch 00017: LearningRateScheduler reducing learning rate to tf.Tensor(0.00020189656, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6388 - accuracy: 0.6553 - val_loss: 0.6879 - val_accuracy: 0.5650
Epoch 18/25
Epoch 00018: LearningRateScheduler reducing learning rate to tf.Tensor(0.00018268357, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6441 - accuracy: 0.6505 - val_loss: 0.6889 - val_accuracy: 0.5650
Epoch 19/25
Epoch 00019: LearningRateScheduler reducing learning rate to tf.Tensor(0.00016529893, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6427 - accuracy: 0.6533 - val_loss: 0.6880 - val_accuracy: 0.5650
Epoch 20/25
Epoch 00020: LearningRateScheduler reducing learning rate to tf.Tensor(0.00014956866, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6434 - accuracy: 0.6330 - val_loss: 0.6886 - val_accuracy: 0.5650
Epoch 21/25
Epoch 00021: LearningRateScheduler reducing learning rate to tf.Tensor(0.00013533531, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6279 - accuracy: 0.7061 - val_loss: 0.6880 - val_accuracy: 0.5600
Epoch 22/25
Epoch 00022: LearningRateScheduler reducing learning rate to tf.Tensor(0.00012245646, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6368 - accuracy: 0.6492 - val_loss: 0.6883 - val_accuracy: 0.5700
Epoch 23/25
Epoch 00023: LearningRateScheduler reducing learning rate to tf.Tensor(0.000110803194, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6385 - accuracy: 0.6558 - val_loss: 0.6886 - val_accuracy: 0.5650
Epoch 24/25
Epoch 00024: LearningRateScheduler reducing learning rate to tf.Tensor(0.000100258876, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6323 - accuracy: 0.6689 - val_loss: 0.6884 - val_accuracy: 0.5600
Epoch 25/25
Epoch 00025: LearningRateScheduler reducing learning rate to tf.Tensor(9.0717986e-05, shape=(), dtype=float32).
25/25 [==============================] - 0s 3ms/step - loss: 0.6387 - accuracy: 0.6513 - val_loss: 0.6880 - val_accuracy: 0.5700
Save and load the model, then check the lr of the loaded model:
model.save('mymodel.h5')
model1 = tf.keras.models.load_model('/content/mymodel.h5')
model1.optimizer.learning_rate
Output:
<tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=9.0717986e-05>
As shown above, the final lr value in the training log (9.0717986e-05 at epoch 25) matches the lr of the loaded model.
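As a sanity check on that number, the schedule can be reproduced with plain arithmetic. This is only a sketch: the helper name final_lr is hypothetical, and the starting value 0.001 is taken from the first epoch of the log (Adam's default learning rate).

```python
import math

def final_lr(initial_lr, epochs):
    """Replay the scheduler: keep lr for epoch index 0, then multiply by exp(-0.1)."""
    lr = initial_lr
    for epoch in range(epochs):
        if epoch >= 1:
            lr *= math.exp(-0.1)
    return lr

# After 25 epochs the decay has been applied 24 times: 0.001 * exp(-2.4)
print(final_lr(0.001, 25))
```

The printed value should be close to 9.07e-05, in line with the lr stored in the loaded optimizer; small differences in the last digits are expected because the log was produced in float32.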
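The only thing to remember when resuming training is to pass a value for the initial_epoch argument of model.fit(), so that everything that depends on the epoch number (such as the lr scheduler above) is computed correctly.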
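A minimal sketch of resuming with initial_epoch might look like the following. It is a shortened variant of the code above, not the original run: the layer sizes, epoch counts, temporary file path, and the use of math.exp (so the scheduler returns a plain Python float) are assumptions made for illustration.

```python
import math
import os
import tempfile

import numpy as np
import tensorflow as tf

x = np.random.uniform(0, 1, (200, 32))
y = np.random.randint(0, 2, (200,))

def scheduler(epoch, lr):
    # Keep lr for epoch index 0, then decay it by exp(-0.1) per epoch.
    return lr if epoch < 1 else lr * math.exp(-0.1)

callback = tf.keras.callbacks.LearningRateScheduler(scheduler)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# First run: epoch indices 0, 1 and 2.
first_epochs = 3
model.fit(x, y, epochs=first_epochs, callbacks=[callback], verbose=0)

# Save the full model (architecture, weights, optimizer state) to an HDF5 file.
path = os.path.join(tempfile.mkdtemp(), "mymodel.h5")
model.save(path)

# Resume: initial_epoch tells fit() that training continues at epoch index 3;
# epochs is the index of the final epoch to stop before, not a count of extra epochs.
resumed = tf.keras.models.load_model(path)
history = resumed.fit(x, y, initial_epoch=first_epochs, epochs=5,
                      callbacks=[callback], verbose=0)

print(history.epoch)
resumed_lr = float(resumed.optimizer.learning_rate.numpy())
print(resumed_lr)
```

Here history.epoch should list only the epoch indices 3 and 4, and the scheduler receives those indices rather than starting again from 0. Note that epochs=5 means "train up to epoch 5 in total", so the resumed call runs two more epochs.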