Question

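I am building a simple Sequential model in Keras (tensorflow backend). During training I want to inspect the individual training batches and the model predictions. Therefore, I am trying to create a custom Callback that saves the model predictions and targets for each training batch. However, the model does not use the current batch for the prediction, but the entire training data.

How can I hand over only the current training batch to the Callback?

And how can I access the batches and targets that the Callback saves in self.predhis and self.targets?

My current version looks as follows: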

callback_list = [prediction_history((self.x_train, self.y_train))]

self.model.fit(self.x_train, self.y_train, batch_size=self.batch_size, epochs=self.n_epochs, validation_data=(self.x_val, self.y_val), callbacks=callback_list)

class prediction_history(keras.callbacks.Callback):
    def __init__(self, train_data):
        self.train_data = train_data
        self.predhis = []
        self.targets = []

    def on_batch_end(self, batch, logs=None):
        x_train, y_train = self.train_data
        self.targets.append(y_train)
        prediction = self.model.predict(x_train)
        self.predhis.append(prediction)
        tf.logging.info("Prediction shape: {}".format(prediction.shape))
        tf.logging.info("Targets shape: {}".format(y_train.shape))

Answer 1

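Note: the originally accepted answer was wrong, as pointed out in the comments. Since it has been accepted and cannot be deleted, I have rewritten it to provide a working answer.

After the model is compiled, the placeholder tensor for y_true is in model.targets and y_pred is in model.outputs.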

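To save the values of these placeholders at each batch, you can:

1. First copy the values of these tensors into variables.
2. Evaluate these variables in on_batch_end, and store the resulting arrays.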

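Now, step 1 is a bit involved because you have to add tf.assign ops to the training function model.train_function. With the current Keras API, this can be done by passing a fetches argument to K.function() when the training function is constructed.

In model._make_train_function(), there is a line: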

self.train_function = K.function(inputs,
                                 [self.total_loss] + self.metrics_tensors,
                                 updates=updates,
                                 name='train_function',
                                 **self._function_kwargs)

The fetches argument containing the tf.assign ops can be provided via model._function_kwargs (this only works after Keras 2.1.0).

As an example:

import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.callbacks import Callback
from keras.layers import Dense
from keras.models import Sequential

class CollectOutputAndTarget(Callback):
    def __init__(self):
        super(CollectOutputAndTarget, self).__init__()
        self.targets = []  # collect y_true batches
        self.outputs = []  # collect y_pred batches

        # the shape of these 2 variables will change according to batch shape
        # to handle the "last batch", specify `validate_shape=False`
        self.var_y_true = tf.Variable(0., validate_shape=False)
        self.var_y_pred = tf.Variable(0., validate_shape=False)

    def on_batch_end(self, batch, logs=None):
        # evaluate the variables and save them into lists
        self.targets.append(K.eval(self.var_y_true))
        self.outputs.append(K.eval(self.var_y_pred))

# build a simple model
# have to compile first for model.targets and model.outputs to be prepared
model = Sequential([Dense(5, input_shape=(10,))])
model.compile(loss='mse', optimizer='adam')

# initialize the variables and the `tf.assign` ops
cbk = CollectOutputAndTarget()
fetches = [tf.assign(cbk.var_y_true, model.targets[0], validate_shape=False),
           tf.assign(cbk.var_y_pred, model.outputs[0], validate_shape=False)]
model._function_kwargs = {'fetches': fetches}  # use `model._function_kwargs` if using `Model` instead of `Sequential`

# fit the model and check results
X = np.random.rand(10, 10)
Y = np.random.rand(10, 5)
model.fit(X, Y, batch_size=8, callbacks=[cbk])

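Unless the number of samples is divisible by the batch size, the final batch will have a different size from the other batches. So K.variable() and K.update() can't be used in this case. You have to use tf.Variable(..., validate_shape=False) and tf.assign(..., validate_shape=False) instead.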
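
To make the "last batch" issue concrete, here is a minimal sketch in plain Python. The helper name batch_sizes is hypothetical and is not part of Keras; it only mimics how consecutive batches are cut from the sample range.

```python
def batch_sizes(num_samples, batch_size):
    """Return the size of each consecutive batch; only the last may be smaller."""
    return [
        min(batch_size, num_samples - start)
        for start in range(0, num_samples, batch_size)
    ]


# 10 samples with batch_size=8 (as in the example above): one full batch, one of size 2
print(batch_sizes(10, 8))  # [8, 2]
# When the batch size divides the sample count, every batch has the same shape
print(batch_sizes(16, 8))  # [8, 8]
```

Because the shape of the fetched tensors changes on the final batch, a variable with a fixed, validated shape cannot hold both.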

To verify the correctness of the saved arrays, you can add one line in training.py to print out the shuffled index array:

if shuffle == 'batch':
    index_array = _batch_shuffle(index_array, batch_size)
elif shuffle:
    np.random.shuffle(index_array)

print('Index array:', repr(index_array))  # Add this line

batches = _make_batches(num_train_samples, batch_size)

The shuffled index array should be printed out during fitting:

Epoch 1/1
Index array: array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
10/10 [==============================] - 0s 23ms/step - loss: 0.5670

You can check whether cbk.targets is the same as Y[index_array]:

index_array = np.array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
print(Y[index_array])
[[ 0.75325592  0.64857277  0.1926653   0.7642865   0.38901153]
 [ 0.77567689  0.13573623  0.4902501   0.42897559  0.55825652]
 [ 0.33760938  0.68195038  0.12303088  0.83509441  0.20991668]
 [ 0.98367778  0.61325065  0.28973401  0.28734073  0.93399794]
 [ 0.26097574  0.88219054  0.87951941  0.64887846  0.41996446]
 [ 0.97794604  0.91307569  0.93816428  0.2125808   0.94381495]
 [ 0.74813435  0.08036688  0.38094272  0.83178364  0.16713736]
 [ 0.52609421  0.39218962  0.21022047  0.58569125  0.08012982]
 [ 0.61276627  0.20679494  0.24124858  0.01262245  0.0994412 ]
 [ 0.6026137   0.25620512  0.7398164   0.52558182  0.09955769]]

print(cbk.targets)
[array([[ 0.7532559 ,  0.64857274,  0.19266529,  0.76428652,  0.38901153],
        [ 0.77567691,  0.13573623,  0.49025011,  0.42897558,  0.55825651],
        [ 0.33760938,  0.68195039,  0.12303089,  0.83509439,  0.20991668],
        [ 0.9836778 ,  0.61325067,  0.28973401,  0.28734073,  0.93399793],
        [ 0.26097575,  0.88219053,  0.8795194 ,  0.64887846,  0.41996446],
        [ 0.97794604,  0.91307569,  0.93816429,  0.2125808 ,  0.94381493],
        [ 0.74813437,  0.08036689,  0.38094273,  0.83178365,  0.16713737],
        [ 0.5260942 ,  0.39218962,  0.21022047,  0.58569127,  0.08012982]], dtype=float32),
 array([[ 0.61276627,  0.20679495,  0.24124858,  0.01262245,  0.0994412 ],
        [ 0.60261369,  0.25620511,  0.73981643,  0.52558184,  0.09955769]], dtype=float32)]

As you can see, there are two batches in cbk.targets (one "full batch" of size 8 and the final batch of size 2), and the row order is the same as Y[index_array].
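
The same comparison can be done programmatically rather than by eye. This is a minimal sketch using NumPy only; the list named collected is a stand-in for cbk.targets, built here by slicing so that the snippet runs without a trained model, and the random data is made up.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.random((10, 5))
index_array = np.array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])

# Stand-in for cbk.targets: float32 batches of size 8 and 2, in shuffled order
shuffled = Y[index_array].astype(np.float32)
collected = [shuffled[:8], shuffled[8:]]

# Stack the per-batch arrays back together and compare with the shuffled targets.
# np.allclose tolerates the float64 -> float32 rounding seen in the printout above.
matches = np.allclose(np.concatenate(collected, axis=0), Y[index_array])
print(matches)  # True
```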

Answer 2

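Update: for TF >= 2.2, see my other answer.

One problem with @Yu-Yang's solution is that it relies on model._function_kwargs, which is not guaranteed to work because it is not part of the API. In particular, in TF2 with eager execution, session kwargs seem to be either not accepted at all or run preemptively due to eager mode.

Therefore, here is my solution, tested on tensorflow==2.1.0. The trick is to replace fetches with a Keras metric, in which the assignment operation from fetches is performed during training.

This even enables a Keras-only solution if the batch size divides the number of samples; otherwise, another trick has to be applied when initializing TensorFlow variables with a None shape, similar to validate_shape=False in the earlier solution (compare https://github.com/tensorflow/tensorflow/issues/35667).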
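
To illustrate the None-shape trick mentioned above, here is a minimal sketch, assuming TensorFlow 2.x in eager mode. The variable name var_y_true mirrors the earlier answer; the assigned arrays are made-up stand-ins for a full batch and a smaller final batch.

```python
import tensorflow as tf

# shape=tf.TensorShape(None) leaves the variable's shape unspecified,
# so it can later hold values of different shapes.
var_y_true = tf.Variable(0.0, shape=tf.TensorShape(None))

var_y_true.assign(tf.ones((8, 5)))  # a "full" batch
full_shape = var_y_true.numpy().shape

var_y_true.assign(tf.ones((2, 5)))  # the smaller final batch
last_shape = var_y_true.numpy().shape

print(full_shape, last_shape)
```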

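Importantly, tf.keras behaves differently from keras (sometimes just ignoring assignments, or treating variables as Keras symbolic tensors), so this updated solution takes care of both implementations (Keras==2.3.1 and tensorflow==2.1.0).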

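Update: this solution still works with tensorflow==2.2.0rc1 using Keras==2.3.1. However, I have not yet been able to obtain the targets with tf.keras, because the attribute this relies on is not available there: the pain of using undocumented APIs. My other answer solves this issue.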


Answer 3

Starting from TF 2.2, you can use a custom training step rather than a callback to achieve what you want. Here is a demo that works with tensorflow==2.2.0rc1; it uses inheritance to extend the keras.Sequential model. Performance-wise this is not ideal, because predictions are made twice: once in self(x, training=True) and once in super().train_step(data). But you get the idea.

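This works in eager mode and does not rely on private APIs, so it should be pretty stable. One caveat is that you have to use tf.keras (standalone keras does not support Model.train_step), but I feel that standalone keras is becoming more and more deprecated anyway.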
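
A minimal sketch of the inheritance approach, assuming tf.keras with TF >= 2.2. The class name RecordingSequential and the module-level lists targets and outputs are hypothetical names chosen for illustration. To keep the sketch short it compiles with run_eagerly=True so that each batch arrives as an eager tensor with .numpy(); that is a simplifying assumption, not necessarily how the answer's own demo stored the values.

```python
import numpy as np
from tensorflow import keras

targets, outputs = [], []  # one entry per training batch


class RecordingSequential(keras.Sequential):
    """Hypothetical subclass that records y_true and y_pred for each batch."""

    def train_step(self, data):
        x, y = data  # assumes fit(x, y) is called without sample weights
        targets.append(y.numpy())
        # Extra forward pass made only for recording; super().train_step
        # below runs its own forward pass, hence the double prediction.
        outputs.append(self(x, training=True).numpy())
        return super().train_step(data)


model = RecordingSequential([keras.layers.Dense(5, input_shape=(10,))])
# run_eagerly=True is an assumption of this sketch (needed for .numpy())
model.compile(loss="mse", optimizer="adam", run_eagerly=True)

X = np.random.rand(10, 10).astype("float32")
Y = np.random.rand(10, 5).astype("float32")
model.fit(X, Y, batch_size=8, epochs=1, shuffle=False, verbose=0)

print([t.shape for t in targets])
```

With shuffle=False the recorded targets, concatenated, should equal Y row for row, which makes the result easy to check.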


Finally, here is a very similar example that does not use inheritance:
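
A minimal sketch of such a variant, assuming tf.keras with TF >= 2.2 and, as a simplification, run_eagerly=True. It wraps train_step on an existing model instance; the names recording_train_step, original_train_step, targets and outputs are hypothetical.

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.layers.Dense(5, input_shape=(10,))])
# run_eagerly=True is an assumption of this sketch (needed for .numpy())
model.compile(loss="mse", optimizer="adam", run_eagerly=True)

targets, outputs = [], []  # one entry per training batch
original_train_step = model.train_step  # bound method of the unpatched model


def recording_train_step(data):
    """Record the batch, then delegate to the model's own training step."""
    x, y = data  # assumes fit(x, y) is called without sample weights
    targets.append(y.numpy())
    outputs.append(model(x, training=True).numpy())  # extra forward pass
    return original_train_step(data)


# Override the method on this instance only; the class is left untouched.
model.train_step = recording_train_step

X = np.random.rand(10, 10).astype("float32")
Y = np.random.rand(10, 5).astype("float32")
model.fit(X, Y, batch_size=8, epochs=1, shuffle=False, verbose=0)

print([t.shape for t in targets])
```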


Answer 4

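Inspired by the way tf.keras.callbacks.TensorBoard saves v1 (graph) summaries.

No variable assignments and no redundant metrics.

To use it with tensorflow >= 2.0.0, run the evaluation in graph mode (eager execution disabled).

Extensive operations on the numpy predictions can be implemented by overriding SavePrediction._pred_callback.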

import numpy as np
import tensorflow as tf
from tensorflow import keras

tf.compat.v1.disable_eager_execution()

in_shape = (2,)
out_shape = (1,)
batch_size = 2
n_samples = 32


class SavePrediction(keras.callbacks.Callback):
    def __init__(self):
        super().__init__()
        self._get_pred = None
        self.preds = []

    def _pred_callback(self, preds):
        self.preds.append(preds)

    def set_model(self, model):
        super().set_model(model)
        if self._get_pred is None:
            self._get_pred = self.model.outputs[0]

    def on_test_begin(self, logs):
        # pylint: disable=protected-access
        self.model._make_test_function()
        # pylint: enable=protected-access
        if self._get_pred not in self.model.test_function.fetches:
            self.model.test_function.fetches.append(self._get_pred)
            self.model.test_function.fetch_callbacks[self._get_pred] = self._pred_callback

    def on_test_end(self, logs):
        if self._get_pred in self.model.test_function.fetches:
            self.model.test_function.fetches.remove(self._get_pred)
        if self._get_pred in self.model.test_function.fetch_callbacks:
            self.model.test_function.fetch_callbacks.pop(self._get_pred)

        print(self.preds)


model = keras.Sequential([
    keras.layers.Dense(out_shape[0], input_shape=in_shape)
])
model.compile(loss="mse", optimizer="adam")

X = np.random.rand(n_samples, *in_shape)
Y = np.random.rand(n_samples, *out_shape)

model.evaluate(X, Y,
               batch_size=batch_size,
               callbacks=[SavePrediction()])

Create keras callback to save model predictions and targets for each batch during training
