Multiple validation sets with Keras

Asked: 2017-12-09 18:45:38

Tags: validation keras monitoring

I am training a Keras model with the model.fit() method. I would like to use multiple validation sets, each of which should be evaluated separately after every training epoch, so that I get one loss value per validation set. If possible, these values should be displayed during training and also be returned by the keras.callbacks.History() callback.

I was thinking of something like this:

history = model.fit(train_data, train_targets,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_data=[
                        (validation_data1, validation_targets1), 
                        (validation_data2, validation_targets2)],
                    shuffle=True)

I currently have no idea how to implement this. Is it possible to achieve it by writing my own Callback? Or how else would you solve this problem?

3 Answers:

Answer 0 (score: 15)

I ended up writing my own Callback, based on the History callback, to solve the problem. I am not sure whether this is the best approach, but the following Callback records the loss and metrics for the training and validation set just like the History callback does, and in addition evaluates the loss and metrics on the extra validation sets passed to its constructor.

from keras.callbacks import Callback

class AdditionalValidationSets(Callback):
    def __init__(self, validation_sets, verbose=0, batch_size=None):
        """
        :param validation_sets:
        a list of 3-tuples (validation_data, validation_targets, validation_set_name)
        or 4-tuples (validation_data, validation_targets, sample_weights, validation_set_name)
        :param verbose:
        verbosity mode, 1 or 0
        :param batch_size:
        batch size to be used when evaluating on the additional datasets
        """
        super(AdditionalValidationSets, self).__init__()
        self.validation_sets = validation_sets
        for validation_set in self.validation_sets:
            # each validation set must be a 3-tuple or 4-tuple (see docstring)
            if len(validation_set) not in [3, 4]:
                raise ValueError()
        self.epoch = []
        self.history = {}
        self.verbose = verbose
        self.batch_size = batch_size

    def on_train_begin(self, logs=None):
        self.epoch = []
        self.history = {}

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.epoch.append(epoch)

        # record the same values as History() as well
        for k, v in logs.items():
            self.history.setdefault(k, []).append(v)

        # evaluate on the additional validation sets
        for validation_set in self.validation_sets:
            if len(validation_set) == 3:
                validation_data, validation_targets, validation_set_name = validation_set
                sample_weights = None
            elif len(validation_set) == 4:
                validation_data, validation_targets, sample_weights, validation_set_name = validation_set
            else:
                raise ValueError()

            results = self.model.evaluate(x=validation_data,
                                          y=validation_targets,
                                          verbose=self.verbose,
                                          sample_weight=sample_weights,
                                          batch_size=self.batch_size)

            for i, result in enumerate(results):
                if i == 0:
                    valuename = validation_set_name + '_loss'
                else:
                    # model.metrics_names is ['loss', metric1, ...], so index i matches results
                    valuename = validation_set_name + '_' + self.model.metrics_names[i]
                self.history.setdefault(valuename, []).append(result)

I am using it like this:

history = AdditionalValidationSets([(validation_data2, validation_targets2, 'val2')])
model.fit(train_data, train_targets,
          epochs=epochs,
          batch_size=batch_size,
          validation_data=(validation_data1, validation_targets1),
          callbacks=[history],
          shuffle=True)
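
After training, the additional validation results can be read back from the callback itself. A minimal sketch, assuming the 'val2' name chosen above (each key holds one value per epoch, following the naming scheme in on_epoch_end):

# the loss on the extra set is stored under '<set_name>_loss',
# plus one entry per compiled metric under '<set_name>_<metric name>'
print(history.history['val2_loss'])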

Answer 1 (score: 1)

I tested this with TensorFlow 2 and it works. At the end of each epoch you can evaluate as many validation sets as you like:

import tensorflow as tf

class MyCustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        res_eval_1 = self.model.evaluate(X_test_1, y_test_1, verbose = 0)
        res_eval_2 = self.model.evaluate(X_test_2, y_test_2, verbose = 0)
        print(res_eval_1)
        print(res_eval_2)

And later on:

my_val_callback = MyCustomCallback()
# Your model creation code
model.fit(..., callbacks=[my_val_callback])
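
If you also want these numbers recorded by the History callback (as the question asks) rather than only printed, one possible variation is to write them into the logs dict. This is a minimal, untested sketch; it assumes the built-in History callback runs after user callbacks (the default ordering in tf.keras), and the class name and dataset names are placeholders:

import tensorflow as tf

class ExtraValidationToLogs(tf.keras.callbacks.Callback):
    """Evaluate one extra dataset per epoch and add its loss to `logs`."""

    def __init__(self, x, y, name):
        super().__init__()
        self.x, self.y, self.name = x, y, name

    def on_epoch_end(self, epoch, logs=None):
        if logs is None:
            return
        results = self.model.evaluate(self.x, self.y, verbose=0)
        # evaluate() returns a scalar when the model has no extra metrics
        loss = results[0] if isinstance(results, list) else results
        logs[self.name + '_loss'] = loss

model.fit(..., callbacks=[ExtraValidationToLogs(X_test_2, y_test_2, 'val2')])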

Answer 2 (score: 0)

Considering the current Keras docs, you can pass callbacks to evaluate and evaluate_generator. So you can simply call evaluate several times with different datasets.

I have not tested this, so I would be happy if you comment on your experience with it below.
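
A minimal, untested sketch of that idea, reusing the placeholder validation sets from the question:

# evaluate() returns [loss, *metrics] in the order given by model.metrics_names
# (this assumes at least one compiled metric, so the result is a list)
results_1 = model.evaluate(validation_data1, validation_targets1, verbose=0)
results_2 = model.evaluate(validation_data2, validation_targets2, verbose=0)
print(dict(zip(model.metrics_names, results_1)))
print(dict(zip(model.metrics_names, results_2)))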