Question

I am learning TensorFlow by working through the sample code at https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census/tf-keras

Here is a short snippet showing how the input to model.fit is built.

def input_fn(dataset, shuffle, n_epoch, s_batch):
    # Optionally shuffle, then repeat the data n_epoch times and split it into batches.
    if shuffle:
        dataset = dataset.shuffle(buffer_size=10000)
    dataset = dataset.repeat(n_epoch)
    dataset = dataset.batch(s_batch)
    return dataset

n_epoch = 10
s_batch = 100
s_samples = ...  # placeholder: number of samples in the training data

training_dataset_input = input_fn(
    training_dataset,
    shuffle=True,
    n_epoch=n_epoch,
    s_batch=s_batch)

mymodel.fit(training_dataset_input,epochs=n_epoch,steps_per_epoch=int(s_samples/s_batch))

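My question is about how epochs work. I thought one epoch was one complete pass over the entire dataset. However, when steps_per_epoch is set, training continues from where it left off in the same dataset; it does not seem to restart from the beginning. So, what is the difference between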

mymodel.fit(training_dataset_input,epochs=n_epoch,steps_per_epoch=int(s_samples/s_batch))

and exhausting the entire repeated dataset in a single epoch:

mymodel.fit(training_dataset_input)

Both fit calls would use the entire dataset 10 times and run the same number of training steps.

Answer 1

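However, when steps_per_epoch is set, training continues from where it left off in the same dataset; it does not seem to restart from the beginning. So what is the difference?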

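If steps_per_epoch is not set, one epoch is one complete pass through the data passed to fit. For a repeated dataset, that pass includes every repetition.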
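To put numbers on the claim that both calls run the same number of steps, here is a small arithmetic sketch. The two function names are made up for illustration, and it assumes the default batch behaviour, where the final partial batch is kept rather than dropped.

```python
def steps_single_epoch(s_samples, n_epoch, s_batch):
    """Steps when fit() drains repeat(n_epoch).batch(s_batch) in one epoch.

    Assumes the last partial batch is kept, so the count is a ceiling division.
    """
    total_elements = s_samples * n_epoch
    return (total_elements + s_batch - 1) // s_batch


def steps_with_steps_per_epoch(s_samples, n_epoch, s_batch):
    """Steps when fit() runs n_epoch epochs of int(s_samples / s_batch) steps."""
    return n_epoch * int(s_samples / s_batch)


# Sample count divisible by the batch size: both calls run the same steps.
print(steps_single_epoch(1000, 10, 100), steps_with_steps_per_epoch(1000, 10, 100))

# Not divisible: the steps_per_epoch variant stops a few batches short.
print(steps_single_epoch(1050, 10, 100), steps_with_steps_per_epoch(1050, 10, 100))
```

So the two calls match exactly only when the number of samples is a multiple of the batch size; otherwise the steps_per_epoch version leaves a few batches of the repeated dataset unused.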

If steps_per_epoch is set, one "epoch" is that number of training steps, and (as you noted) the next "epoch" starts where the previous one left off.
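The "starts where the previous one left off" behaviour can be pictured with a plain-Python model. This is only an illustration of the idea, not TensorFlow's actual implementation; make_batches and run_epochs are hypothetical helpers standing in for repeat/batch and for the fit loop.

```python
from itertools import islice


def make_batches(samples, n_epoch, s_batch):
    """Plain-Python stand-in for dataset.repeat(n_epoch).batch(s_batch)."""
    stream = [x for _ in range(n_epoch) for x in samples]
    return [stream[i:i + s_batch] for i in range(0, len(stream), s_batch)]


def run_epochs(batches, epochs, steps_per_epoch):
    """Draw steps_per_epoch batches per 'epoch' from one shared iterator."""
    it = iter(batches)  # created once, never reset between epochs
    return [list(islice(it, steps_per_epoch)) for _ in range(epochs)]


samples = list(range(6))                       # toy data: 0..5
batches = make_batches(samples, n_epoch=2, s_batch=2)
history = run_epochs(batches, epochs=3, steps_per_epoch=2)

for i, epoch_batches in enumerate(history, start=1):
    print(f"epoch {i}: {epoch_batches}")
```

In this toy run the second "epoch" begins with the batch [4, 5], the one right after where the first stopped, rather than going back to [0, 1].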

This is useful if, for example, you want to run validation more frequently on a large dataset.
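As a rough sketch of the validation use case: Keras runs validation at the end of each epoch, so a smaller steps_per_epoch means more validation runs per pass over the data. The helper name and the numbers below are made up for illustration, and the dataset is assumed to be repeated enough times to supply all the steps.

```python
def virtual_epoch_plan(s_samples, s_batch, n_passes, steps_per_epoch):
    """Work out how many short 'epochs' cover n_passes full passes of the data."""
    steps_per_pass = s_samples // s_batch        # full batches in one real pass
    total_steps = steps_per_pass * n_passes      # steps for all passes
    return {
        "steps_per_pass": steps_per_pass,
        "epochs": total_steps // steps_per_epoch,              # value for fit(epochs=...)
        "validations_per_pass": steps_per_pass // steps_per_epoch,
    }


# Hypothetical numbers: 1,000,000 samples, batch size 100, 10 passes,
# with an "epoch" declared every 1,000 steps.
plan = virtual_epoch_plan(1_000_000, 100, 10, 1000)
print(plan)
```

With these numbers, fit would be called with epochs=100 and steps_per_epoch=1000, which gives 10 validation runs per pass over the data instead of one.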

Why use steps_per_epoch when replicating a tf.dataset?