Suppose I'm doing transfer learning with Inception. I added a few layers on top and trained for a while.
Here is my model topology:
base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu', name = 'Dense_1')(x)
predictions = Dense(12, activation='softmax', name = 'Predictions')(x)
model = Model(input=base_model.input, output=predictions)
I trained this model for a while, then saved and reloaded it for further training; this time I want to add an l2 regularizer to Dense_1 without resetting its weights. Is that possible?
path = r'.\model.hdf5'
from keras.models import load_model
model = load_model(path)
The documentation only shows how to add a regularizer as an argument when the layer is first created:
from keras import regularizers
model.add(Dense(64, input_dim=64,
                kernel_regularizer=regularizers.l2(0.01),
                activity_regularizer=regularizers.l1(0.01)))
But this creates a brand-new layer, so my layer's weights would be reset.
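One workaround along these lines (a sketch on a hypothetical toy model, not something the docs describe) is to create the regularized replacement layer, rewire the graph through it, and then copy the trained weights over with get_weights/set_weights:

```python
from tensorflow.keras import Input, Model, regularizers
from tensorflow.keras.layers import Dense

# Hypothetical toy stand-in for the trained model (small shapes for brevity).
inp = Input(shape=(8,))
x = Dense(16, activation='relu', name='Dense_1')(inp)
out = Dense(3, activation='softmax', name='Predictions')(x)
model = Model(inputs=inp, outputs=out)

# Rebuild Dense_1 with the regularizer, then copy the trained weights back.
old = model.get_layer('Dense_1')
new = Dense(16, activation='relu', name='Dense_1',
            kernel_regularizer=regularizers.l2(0.01))
x = new(inp)                        # calling the layer builds its variables
out = model.get_layer('Predictions')(x)
model = Model(inputs=inp, outputs=out)
new.set_weights(old.get_weights())  # trained weights survive the swap
```

Because the regularizer is passed to the constructor, it ends up in the graph properly, and the copied weights make the new layer numerically identical to the old one.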
Edit:
I've been playing with this code over the past few days, and something strange happens to my loss when I reload the model (after training for a bit with the new regularizer).
The first time I run this code (the first run with the new regularizer):
from keras.models import load_model
base_model = load_model(path)
x = base_model.get_layer('dense_1').output
predictions = base_model.get_layer('dense_2')(x)
model = Model(inputs=base_model.input, outputs=predictions)
model.get_layer('dense_1').kernel_regularizer = regularizers.l2(0.02)
model.compile(optimizer=SGD(lr=0.0001, momentum=0.90),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
My training output seems normal:
Epoch 43/50
- 2918s - loss: 0.3834 - acc: 0.8861 - val_loss: 0.4253 - val_acc: 0.8723
Epoch 44/50
Epoch 00044: saving model to E:\Keras Models\testing_3\2018-01-18_44.hdf5
- 2692s - loss: 0.3781 - acc: 0.8869 - val_loss: 0.4217 - val_acc: 0.8729
Epoch 45/50
- 2690s - loss: 0.3724 - acc: 0.8884 - val_loss: 0.4169 - val_acc: 0.8748
Epoch 46/50
Epoch 00046: saving model to E:\Keras Models\testing_3\2018-01-18_46.hdf5
- 2684s - loss: 0.3688 - acc: 0.8896 - val_loss: 0.4137 - val_acc: 0.8748
Epoch 47/50
- 2665s - loss: 0.3626 - acc: 0.8908 - val_loss: 0.4097 - val_acc: 0.8763
Epoch 48/50
Epoch 00048: saving model to E:\Keras Models\testing_3\2018-01-18_48.hdf5
- 2681s - loss: 0.3586 - acc: 0.8924 - val_loss: 0.4069 - val_acc: 0.8767
Epoch 49/50
- 2679s - loss: 0.3549 - acc: 0.8930 - val_loss: 0.4031 - val_acc: 0.8776
Epoch 50/50
Epoch 00050: saving model to E:\Keras Models\testing_3\2018-01-18_50.hdf5
- 2680s - loss: 0.3493 - acc: 0.8950 - val_loss: 0.4004 - val_acc: 0.8787
However, if I try to load the model after this mini training session (I load the model from epoch 00050, so the new regularization value should already be in effect), I get a very high loss value.
Code:
path = r'E:\Keras Models\testing_3\2018-01-18_50.hdf5'  # 50th-epoch model
from keras.models import load_model
model = load_model(path)
model.compile(optimizer=SGD(lr=0.0001, momentum=0.90),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Returns:
Epoch 51/65
- 3130s - loss: 14.0017 - acc: 0.8953 - val_loss: 13.9529 - val_acc: 0.8800
Epoch 52/65
Epoch 00052: saving model to E:\Keras Models\testing_3\2018-01-20_52.hdf5
- 2813s - loss: 13.8017 - acc: 0.8969 - val_loss: 13.7553 - val_acc: 0.8812
Epoch 53/65
- 2759s - loss: 13.6070 - acc: 0.8977 - val_loss: 13.5609 - val_acc: 0.8824
Epoch 54/65
Epoch 00054: saving model to E:\Keras Models\testing_3\2018-01-20_54.hdf5
- 2748s - loss: 13.4115 - acc: 0.8992 - val_loss: 13.3697 - val_acc: 0.8824
Epoch 55/65
- 2745s - loss: 13.2217 - acc: 0.9006 - val_loss: 13.1807 - val_acc: 0.8840
Epoch 56/65
Epoch 00056: saving model to E:\Keras Models\testing_3\2018-01-20_56.hdf5
- 2752s - loss: 13.0335 - acc: 0.9014 - val_loss: 12.9951 - val_acc: 0.8840
Epoch 57/65
- 2756s - loss: 12.8490 - acc: 0.9023 - val_loss: 12.8118 - val_acc: 0.8849
Epoch 58/65
Epoch 00058: saving model to E:\Keras Models\testing_3\2018-01-20_58.hdf5
- 2749s - loss: 12.6671 - acc: 0.9032 - val_loss: 12.6308 - val_acc: 0.8849
Epoch 59/65
- 2738s - loss: 12.4871 - acc: 0.9039 - val_loss: 12.4537 - val_acc: 0.8855
Epoch 60/65
Epoch 00060: saving model to E:\Keras Models\testing_3\2018-01-20_60.hdf5
- 2765s - loss: 12.3086 - acc: 0.9059 - val_loss: 12.2778 - val_acc: 0.8868
Epoch 61/65
- 2767s - loss: 12.1353 - acc: 0.9065 - val_loss: 12.1055 - val_acc: 0.8867
Epoch 62/65
Epoch 00062: saving model to E:\Keras Models\testing_3\2018-01-20_62.hdf5
- 2757s - loss: 11.9637 - acc: 0.9061 - val_loss: 11.9351 - val_acc: 0.8883
Note the really high loss values. Is this normal? I understand that the l2 regularizer adds to the loss when the weights are large, but wouldn't that already show up in the first mini training session, where I first applied the regularizer? The accuracy seems to stay consistent, though.
Thanks.
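One way to see how much of a reported loss comes from the l2 penalty is to sum model.losses directly; a generic sketch on a hypothetical toy model, not the asker's:

```python
import tensorflow as tf

# Generic check on a hypothetical toy model: the penalty Keras adds to the
# reported loss is just the sum of the tensors in model.losses.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, input_shape=(8,),
                          kernel_regularizer=tf.keras.regularizers.l2(0.02)),
])
penalty = tf.add_n(model.losses)
print(float(penalty))  # compare this magnitude with the reported training loss
```

If this sum is close to the gap between the two runs, the jump is the penalty term rather than a change in the data loss.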
Answer 0 (score: 7)
You need to do two things:
1. Add the regularizer in the following way:
model.get_layer('Dense_1').kernel_regularizer = l2(0.01)
2. Recompile the model:
model.compile(...)
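Put together on a hypothetical toy model, the two steps look like this (note that the answers below report that on newer Keras/TF versions this attribute assignment alone does not actually add a loss term, so check layer.losses afterwards):

```python
from tensorflow.keras import Sequential, regularizers
from tensorflow.keras.layers import Dense

# Toy stand-in for the loaded model (hypothetical shapes).
model = Sequential([Dense(4, input_dim=8, name='Dense_1')])

# Step 1: assign the regularizer on the existing layer.
model.get_layer('Dense_1').kernel_regularizer = regularizers.l2(0.01)

# Step 2: recompile so training picks up the changed configuration.
model.compile(optimizer='sgd', loss='categorical_crossentropy')
```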
Answer 1 (score: 4)
For tensorflow 2.2, you only need to do this:
l2 = tf.keras.regularizers.l2(1e-4)
for layer in model.layers:
    # if hasattr(layer, 'kernel'):
    # or
    # if you want to apply it just to Conv layers:
    if isinstance(layer, tf.keras.layers.Conv2D):
        model.add_loss(lambda layer=layer: l2(layer.kernel))
Hope this helps.
Answer 2 (score: 2)
Marcin's solution did not work for me. As apatsekin mentioned, if you print layer.losses after adding the regularizers the way Marcin suggested, you get an empty list.
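That check can be reproduced on a minimal hypothetical layer; an empty list means the regularizer never made it into the computation graph:

```python
import tensorflow as tf

# Minimal reproduction of the check: assign the regularizer after the layer
# is already built, then look at layer.losses.
layer = tf.keras.layers.Dense(4)
layer.build((None, 8))

layer.kernel_regularizer = tf.keras.regularizers.l2(0.01)
print(list(layer.losses))  # [] -- the late-assigned regularizer added no loss term
```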
I found a workaround that I don't like at all, but I'm posting it here so that someone more capable can find an easier way to do this.
I believe it works for most keras.applications networks. I copied the .py file of the specific architecture (for example InceptionResNetV2) from keras-applications on Github to a local file, regularizedNetwork.py, on my machine. I had to edit it to fix some relative imports, for example:
#old version
from . import imagenet_utils
from .imagenet_utils import decode_predictions
from .imagenet_utils import _obtain_input_shape
backend = None
layers = None
models = None
keras_utils = None
To:
#new version
from keras import backend
from keras import layers
from keras import models
from keras import utils as keras_utils
from keras.applications import imagenet_utils
from keras.applications.imagenet_utils import decode_predictions
from keras.applications.imagenet_utils import _obtain_input_shape
After fixing the relative paths and imports, I added the regularizers to each desired layer, just as you would when defining a new, untrained network. Usually, the models in keras.applications load the pre-trained weights after the architecture is defined.
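The edit inside regularizedNetwork.py is just the usual constructor argument; a hypothetical excerpt of such a layer definition (using tf.keras and made self-contained here for the sketch; the real file uses the architecture's own shapes and names):

```python
from tensorflow.keras import Input, Model, layers, regularizers

# Hypothetical excerpt: each layer definition in the copied architecture
# file gains a kernel_regularizer argument before the weights are loaded.
x = Input(shape=(32, 32, 3))
y = layers.Conv2D(32, 3, strides=2, padding='valid',
                  kernel_regularizer=regularizers.l2(1e-4))(x)
model = Model(x, y)
```

Because the regularizer is present when the layer is constructed, it is registered before the pre-trained weights are loaded on top.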
Now, in your main code/notebook, just import the new regularizedNetwork.py and call the main method to instantiate the network.
#main code
from regularizedNetwork import InceptionResNetV2
All the regularizers should now be set, and you can fine-tune the regularized model normally.
I'm sure there is a less hacky way to do this, so if anyone finds one, please write a new answer and/or comment on this one.
Just for the record, I also tried instantiating a model from keras.applications, getting its architecture with regModel = model.get_config(), adding the regularizers as Marcin suggested, and then loading the weights with {{ 1}}, but it still did not work.
Edit: fixed typos.
Answer 3 (score: 1)
A bit hacky, but it should work. This works for pre-trained models in Tensorflow 2.0. Note that only the layers in model.layers are handled, i.e. nested weighted layers will be skipped. From here: https://sthalles.github.io/keras-regularizer/
import os
import tempfile

import tensorflow as tf

def add_regularization(model, regularizer=tf.keras.regularizers.l2(0.0001)):
    if not isinstance(regularizer, tf.keras.regularizers.Regularizer):
        print("Regularizer must be a subclass of tf.keras.regularizers.Regularizer")
        return model

    for layer in model.layers:
        for attr in ['kernel_regularizer']:
            if hasattr(layer, attr):
                setattr(layer, attr, regularizer)

    # When we change the layers' attributes, the change only happens in the model config
    model_json = model.to_json()

    # Save the weights before reloading the model
    tmp_weights_path = os.path.join(tempfile.gettempdir(), 'tmp_weights.h5')
    model.save_weights(tmp_weights_path)

    # Load the model from the config
    model = tf.keras.models.model_from_json(model_json)

    # Reload the model weights
    model.load_weights(tmp_weights_path, by_name=True)
    return model
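The to_json/model_from_json round-trip is the step that actually instantiates the regularizers; the attribute assignment only changes the layer's config. A self-contained sketch of that same round-trip on a hypothetical toy model:

```python
import os
import tempfile

import tensorflow as tf

# Toy model standing in for the pre-trained one.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(3, activation='softmax'),
])
for layer in model.layers:
    layer.kernel_regularizer = tf.keras.regularizers.l2(1e-4)

# The attribute change alone adds no loss terms...
print(len(model.losses))  # 0

# ...but rebuilding from the (updated) config does, with the weights restored.
tmp_weights_path = os.path.join(tempfile.gettempdir(), 'tmp_weights.h5')
model.save_weights(tmp_weights_path)
model = tf.keras.models.model_from_json(model.to_json())
model.load_weights(tmp_weights_path, by_name=True)
print(len(model.losses))  # 2 -- one l2 term per Dense kernel
```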
Answer 4 (score: 0)
Try this:
import tensorflow as tf

# a utility function to add weight decay after the model is defined
def add_weight_decay(model, weight_decay):
    if (weight_decay is None) or (weight_decay == 0.0):
        return

    # recurse into nested models
    def add_decay_loss(m, factor):
        if isinstance(m, tf.keras.Model):
            for layer in m.layers:
                add_decay_loss(layer, factor)
        else:
            for param in m.trainable_weights:
                with tf.keras.backend.name_scope('weight_regularizer'):
                    # default argument pins down the current param (avoids
                    # the late-binding pitfall of closing over the loop variable)
                    regularizer = lambda p=param: tf.keras.regularizers.l2(factor)(p)
                    m.add_loss(regularizer)

    # weight decay and l2 regularization differ by a factor of 2
    add_decay_loss(model, weight_decay / 2.0)
    return
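A minimal usage sketch of the same add_loss idea on a hypothetical toy model (two trainable weights: kernel and bias):

```python
import tensorflow as tf

# Toy model; add_weight_decay(model, 2e-3) above would attach the same penalties.
model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(8,))])

weight_decay = 2e-3
for param in model.trainable_weights:
    # Default argument captures the current param for each lambda.
    model.add_loss(lambda p=param: tf.keras.regularizers.l2(weight_decay / 2.0)(p))

print(len(model.losses))  # 2 -- one term each for the kernel and the bias
```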
Answer 5 (score: 0)
Found a workaround in the Horovod examples. The idea is to serialize the model, add the L2, and then restore it.
model_config = model.get_config()

for layer, layer_config in zip(model.layers, model_config['layers']):
    if hasattr(layer, 'kernel_regularizer'):
        regularizer = keras.regularizers.l2(args.wd)
        layer_config['config']['kernel_regularizer'] = \
            {'class_name': regularizer.__class__.__name__,
             'config': regularizer.get_config()}
    if type(layer) == keras.layers.BatchNormalization:
        layer_config['config']['momentum'] = 0.9
        layer_config['config']['epsilon'] = 1e-5

model = keras.models.Model.from_config(model_config)
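Note that Model.from_config re-initializes the variables, so the trained weights still have to be carried over, e.g. with get_weights/set_weights. A self-contained sketch of the whole idea on a hypothetical toy model (with a literal rate in place of args.wd):

```python
from tensorflow import keras

# Toy functional model standing in for the trained one.
inputs = keras.Input(shape=(8,))
outputs = keras.layers.Dense(3, name='head')(inputs)
model = keras.Model(inputs, outputs)
trained_weights = model.get_weights()

model_config = model.get_config()
for layer, layer_config in zip(model.layers, model_config['layers']):
    if hasattr(layer, 'kernel_regularizer'):
        regularizer = keras.regularizers.l2(1e-4)  # literal in place of args.wd
        layer_config['config']['kernel_regularizer'] = \
            {'class_name': regularizer.__class__.__name__,
             'config': regularizer.get_config()}

model = keras.Model.from_config(model_config)
model.set_weights(trained_weights)  # restore the trained weights
```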