How to replace the loss function of a tensorflow.keras model during training

Asked: 2020-04-02 16:53:52

Tags: python tensorflow keras loss-function

I want to replace the loss function of my neural network during training. This is the network:

model = tensorflow.keras.models.Sequential()
model.add(tensorflow.keras.layers.Conv2D(32, kernel_size=(3, 3), activation="relu", input_shape=input_shape))
model.add(tensorflow.keras.layers.Conv2D(64, (3, 3), activation="relu"))
model.add(tensorflow.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tensorflow.keras.layers.Dropout(0.25))
model.add(tensorflow.keras.layers.Flatten())
model.add(tensorflow.keras.layers.Dense(128, activation="relu"))
model.add(tensorflow.keras.layers.Dropout(0.5))
model.add(tensorflow.keras.layers.Dense(output_classes, activation="softmax"))
model.compile(loss=tensorflow.keras.losses.categorical_crossentropy, optimizer=tensorflow.keras.optimizers.Adam(0.001), metrics=['accuracy'])
history = model.fit(x_train, y_train, batch_size=128, epochs=5, validation_data=(x_test, y_test))

So now I want to swap tensorflow.keras.losses.categorical_crossentropy for another loss, so I did this:

model.compile(loss=tensorflow.keras.losses.mse, optimizer=tensorflow.keras.optimizers.Adam(0.001), metrics=['accuracy'])
history = model.fit(x_improve, y_improve, epochs=1, validation_data=(x_test, y_test)) # FIXME bug during training

But I get this error:

ValueError: No gradients provided for any variable: ['conv2d/kernel:0', 'conv2d/bias:0', 'conv2d_1/kernel:0', 'conv2d_1/bias:0', 'dense/kernel:0', 'dense/bias:0', 'dense_1/kernel:0', 'dense_1/bias:0'].

Why? How can I fix it? Is there another way to change the loss function?

Thanks

2 Answers:

Answer 0 (score: 3)

So, the simple answer I would give is: switch to pytorch if you want to play this kind of game. Since in pytorch you define your own training and evaluation functions, it takes just an if statement to switch from one loss function to another.

Also, I see in your code that you want to switch from cross_entropy to mean_square_error; the former is suited for classification, the latter for regression, so this is not really something you can do. In the code that follows, I instead switch from mean squared error to mean squared logarithmic error, both of which are losses suited for regression.

That said, other answers do offer solutions to your question (see change-loss-function-dynamically-during-training), but I am not sure you can trust the results: some people found that even with a custom function, sometimes Keras keeps training with the first loss.
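
For reference, here is a minimal sketch of that "dynamic loss" idea as I understand it (my own illustration, not code from the linked answer): the compiled loss reads a non-trainable tf.Variable, and a callback flips that variable between epochs, so no recompilation is needed.

import tensorflow as tf

# illustrative switch: 0.0 -> use mse, 1.0 -> use msle
loss_switch = tf.Variable(0.0, trainable=False)

def switchable_loss(y_true, y_pred):
    # blend the two losses according to the current value of the switch
    mse = tf.keras.losses.mean_squared_error(y_true, y_pred)
    msle = tf.keras.losses.mean_squared_logarithmic_error(y_true, y_pred)
    return (1.0 - loss_switch) * mse + loss_switch * msle

class LossSwitcher(tf.keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs=None):
        if epoch == 2:  # change the loss from the third epoch on
            loss_switch.assign(1.0)

# model.compile(optimizer='adam', loss=switchable_loss)
# model.fit(X, y, epochs=4, callbacks=[LossSwitcher()])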

Solution:

My solution is based on train_on_batch, which allows us to train a model inside a for loop and therefore stop training whenever we prefer, recompile the model with a new loss function, and resume. Note that recompiling the model does not reset the weights (see: Does recompiling a model re-initialize the weights?).

The dataset can be found here: Boston housing dataset

# Regression Example With Boston Dataset: Standardized and Larger
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from keras.losses import mean_squared_error, mean_squared_logarithmic_error
from matplotlib import pyplot

# load dataset
dataframe = read_csv("housing.csv", delim_whitespace=True, header=None)
dataset = dataframe.values

# split into input (X) and output (Y) variables
X = dataset[:,0:13]
y = dataset[:,13]

trainX, testX, trainy, testy = train_test_split(X, y, test_size=0.33, random_state=42)

# create model
model = Sequential()
model.add(Dense(13, input_dim=13, kernel_initializer='normal', activation='relu'))
model.add(Dense(6, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))

batch_size = 25

# manually define a dict to store the scores of all epochs
history = {}
history['history'] = {}
history['history']['loss'] = []
history['history']['mean_squared_error'] = []
history['history']['mean_squared_logarithmic_error'] = []
history['history']['val_loss'] = []
history['history']['val_mean_squared_error'] = []
history['history']['val_mean_squared_logarithmic_error'] = []

# first compiling with mse
model.compile(loss='mean_squared_error', optimizer='adam', metrics=[mean_squared_error, mean_squared_logarithmic_error])

# define number of iterations in training and test
train_iter = round(trainX.shape[0]/batch_size)
test_iter = round(testX.shape[0]/batch_size)

for epoch in range(2):
    
    # train iterations 
    loss, mse, msle = 0, 0, 0
    for i in range(train_iter):
        
        start = i*batch_size
        end = i*batch_size + batch_size
        batchX = trainX[start:end,]
        batchy = trainy[start:end,]
        
        loss_, mse_, msle_ = model.train_on_batch(batchX,batchy)
                
        loss += loss_
        mse += mse_
        msle += msle_
    
    history['history']['loss'].append(loss/train_iter)
    history['history']['mean_squared_error'].append(mse/train_iter)
    history['history']['mean_squared_logarithmic_error'].append(msle/train_iter)
    
    # test iterations 
    val_loss, val_mse, val_msle = 0, 0, 0
    for i in range(test_iter):
        
        start = i*batch_size
        end = i*batch_size + batch_size
        batchX = testX[start:end,]
        batchy = testy[start:end,]
        
        val_loss_, val_mse_, val_msle_ = model.test_on_batch(batchX,batchy)
        
        val_loss += val_loss_
        val_mse += val_mse_
        val_msle += val_msle_
        
    history['history']['val_loss'].append(val_loss/test_iter)
    history['history']['val_mean_squared_error'].append(val_mse/test_iter)
    history['history']['val_mean_squared_logarithmic_error'].append(val_msle/test_iter)
    
# recompiling the model with new loss
model.compile(loss='mean_squared_logarithmic_error', optimizer='adam', metrics=[mean_squared_error, mean_squared_logarithmic_error])

for epoch in range(2):
    
    # train iterations 
    loss, mse, msle = 0, 0, 0
    for i in range(train_iter):
        
        start = i*batch_size
        end = i*batch_size + batch_size
        batchX = trainX[start:end,]
        batchy = trainy[start:end,]
    
        loss_, mse_, msle_ = model.train_on_batch(batchX,batchy)
        
        loss += loss_
        mse += mse_
        msle += msle_
        
    history['history']['loss'].append(loss/train_iter)
    history['history']['mean_squared_error'].append(mse/train_iter)
    history['history']['mean_squared_logarithmic_error'].append(msle/train_iter)
     
    # test iterations 
    val_loss, val_mse, val_msle = 0, 0, 0
    for i in range(test_iter):
        
        start = i*batch_size
        end = i*batch_size + batch_size
        batchX = testX[start:end,]
        batchy = testy[start:end,]
        
        val_loss_, val_mse_, val_msle_ = model.test_on_batch(batchX,batchy)
        
        val_loss += val_loss_
        val_mse += val_mse_
        val_msle += val_msle_
        
    history['history']['val_loss'].append(val_loss/test_iter)
    history['history']['val_mean_squared_error'].append(val_mse/test_iter)
    history['history']['val_mean_squared_logarithmic_error'].append(val_msle/test_iter)
    
# Some plots to check what is going on   
# loss function 
pyplot.subplot(311)
pyplot.title('Loss')
pyplot.plot(history['history']['loss'], label='train')
pyplot.plot(history['history']['val_loss'], label='test')
pyplot.legend()

# Only mean squared error 
pyplot.subplot(312)
pyplot.title('Mean Squared Error')
pyplot.plot(history['history']['mean_squared_error'], label='train')
pyplot.plot(history['history']['val_mean_squared_error'], label='test')
pyplot.legend()

# Only mean squared logarithmic error 
pyplot.subplot(313)
pyplot.title('Mean Squared Logarithmic Error')
pyplot.plot(history['history']['mean_squared_logarithmic_error'], label='train')
pyplot.plot(history['history']['val_mean_squared_logarithmic_error'], label='test')
pyplot.legend()
pyplot.tight_layout()
pyplot.show()

The resulting plots confirm that the loss function changed after the second epoch:

[image: train and test curves for Loss, Mean Squared Error, and Mean Squared Logarithmic Error across the four epochs]

The drop in the loss happens because the model switches from the regular mean squared error to the logarithmic one, whose values are much lower. Printing the scores also proves that the loss actually used did change:

print(history['history']['loss'])
[599.5209197998047, 570.4041115897043, 3.8622902120862688, 2.1578191178185597]
print(history['history']['mean_squared_error'])
[599.5209197998047, 570.4041115897043, 510.29034205845426, 425.32058388846264]
print(history['history']['mean_squared_logarithmic_error'])
[8.624503476279122, 6.346359729766846, 3.8622902120862688, 2.1578191178185597]

During the first two epochs the values of loss equal those of mean_squared_error, and during the third and fourth epochs they equal those of mean_squared_logarithmic_error, the new loss that was set. So it seems that using train_on_batch does allow changing the loss function; nevertheless, I want to stress again that this is essentially what one would do in pytorch to achieve the same result, with the difference that the behavior of pytorch (in this scenario, and in my opinion) is more reliable.

Answer 1 (score: 2)

I am currently working on google colab with Tensorflow and Keras, and I was not able to recompile a model while keeping its weights. Every time I recompiled a model like this:

with strategy.scope():
  model = hd_unet_model(INPUT_SIZE)
  model.compile(optimizer=Adam(lr=0.01), 
                loss=tf.keras.losses.MeanSquaredError() ,
                metrics=[tf.keras.metrics.MeanSquaredError()]) 

the weights were reset. So I found another solution; all you need to do is:

  1. Get the model with the weights you want (load it, or whatever else)
  2. Get the weights of the model like this:
weights = model.get_weights()
  3. Recompile the model (to change the loss function)
  4. Set the weights of the recompiled model again, like this:
model.set_weights(weights)
  5. Launch the training

I tested this method, and it seems to work.

So to change the loss in the middle of training, you can (a short code sketch follows the list):

  1. Compile with the first loss.
  2. Train with the first loss.
  3. Save the weights.
  4. Recompile with the second loss.
  5. Load the weights.
  6. Train with the second loss.
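
A minimal sketch of those steps, assuming an already built regression model and data named model, x_train, and y_train (the names and the two losses are illustrative, not taken from the answer above):

model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train, y_train, epochs=2)            # steps 1-2: compile and train with the first loss
weights = model.get_weights()                    # step 3: save the weights
model.compile(optimizer='adam', loss='mean_squared_logarithmic_error')  # step 4: recompile with the second loss
model.set_weights(weights)                       # step 5: load the weights back
model.fit(x_train, y_train, epochs=2)            # step 6: keep training with the second loss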