How do I switch a model's iterator between the training and validation datasets?

Date: 2019-01-30 22:34:49

Tags: tensorflow iterator tensorflow-datasets

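I am learning TensorFlow's lower-level API, where you specify layers manually with tf.layers, create datasets and iterators, and run the loop yourself to train and validate the model. I am trying to do both training and validation. Unfortunately, I run into an error when I try to switch between the training and validation datasets.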

Here is what I have:

self.train_it = \
    train_dataset.batch(self.batch_size).make_initializable_iterator()
self.validate_it = \
    validation_dataset.batch(self.batch_size).make_initializable_iterator()

...

input_layer = self.train_it.get_next()[0]
hidden1 = tf.layers.dense(
    input_layer,
    ... )

...

with tf.name_scope('train'):
  self.train_op = \
        tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(self.loss)

...

for epo in range(epochs):
  # Train using self.train_it iterator.
  self.sess.run(self.train_it.initializer)
  total_loss = 0
  for iteration in range(n_batches):
    summary, _, batch_loss = self.sess.run([self.merged_summary, \
        self.train_op, self.loss])
    total_loss += batch_loss
  print('   Epoch : {}/{}, Training loss = {:.4f}'. \
            format(epo+1, epochs, total_loss / n_batches))
  # Validate using self.validate_it iterator.
  self.sess.run(self.validate_it.initializer)
  # HOW DO I TELL THE MODEL TO USE self.validate_it INSTEAD OF self.train_it ???

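The problem is that at the start I told the model to use train_it, via input_layer = self.train_it.get_next()[0], and now I somehow have to tell it to switch between train_it and validate_it every epoch. I must be missing something in the API about how to do this.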

1 answer:

Answer 0 (score: 1)

I would use a reinitializable iterator and do the following.

train_dataset = train_dataset.batch(batch_size_train)
val_dataset = validation_dataset.batch(batch_size_val)

iterator = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)

train_init_op = iterator.make_initializer(train_dataset)
val_init_op = iterator.make_initializer(val_dataset)

data, labels = iterator.get_next()
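The switching behaviour can be checked in isolation before any model is attached. The following is a minimal sketch, assuming TensorFlow 2.x with the graph-mode API under `tf.compat.v1` (on TensorFlow 1.x the same calls sit directly under `tf`). The integer toy datasets and the helper name `collect_batches` are made up for illustration. It shows that a single `get_next()` tensor yields batches from whichever dataset was initialized most recently.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 2.x; graph-mode API accessed via compat.v1


def collect_batches():
    """Hypothetical helper: pull batches from two toy datasets via one iterator."""
    graph = tf.Graph()
    with graph.as_default():
        # Toy stand-ins for the real training and validation data.
        train_dataset = tf1.data.Dataset.from_tensor_slices(
            np.arange(6, dtype=np.int64)).batch(4)
        val_dataset = tf1.data.Dataset.from_tensor_slices(
            np.arange(100, 103, dtype=np.int64)).batch(4)

        # The iterator is defined by structure only, not bound to a dataset.
        iterator = tf1.data.Iterator.from_structure(
            tf1.data.get_output_types(train_dataset),
            tf1.data.get_output_shapes(train_dataset))
        train_init_op = iterator.make_initializer(train_dataset)
        val_init_op = iterator.make_initializer(val_dataset)

        # A model would be built on this single tensor.
        batch = iterator.get_next()

        with tf1.Session(graph=graph) as sess:
            sess.run(train_init_op)   # iterator now reads the training data
            train_batches = [sess.run(batch).tolist(),
                             sess.run(batch).tolist()]
            sess.run(val_init_op)     # same tensor now reads validation data
            val_batches = [sess.run(batch).tolist()]
    return train_batches, val_batches


if __name__ == "__main__":
    print(collect_batches())
```

Both datasets must share the same element types and compatible shapes, because the iterator is created from the structure of only one of them.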

Then wire data and labels into your model. During training, do the following:

for e in range(epochs):
    sess.run(train_init_op)
    for iteration in range(n_batches_train):
        ....
    sess.run(val_init_op)
    for iteration in range(n_batches_val):
        ....
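For a complete picture, here is one possible end-to-end version of that loop. This is a sketch under stated assumptions, not the answerer's exact code. It assumes TensorFlow 2.x with `tf.compat.v1`, random toy data, and a single weight matrix standing in for the real model. The function name `run_epochs` is hypothetical. Instead of counting batches, it catches `tf.errors.OutOfRangeError`, which `get_next()` raises once the current dataset is exhausted.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 2.x; graph-mode API accessed via compat.v1


def run_epochs(epochs=2):
    """Hypothetical helper: train and validate through one shared iterator."""
    rng = np.random.RandomState(0)
    x = rng.rand(20, 3).astype(np.float32)   # toy features
    y = x.sum(axis=1, keepdims=True)         # toy regression targets

    graph = tf.Graph()
    with graph.as_default():
        train_dataset = tf1.data.Dataset.from_tensor_slices(
            (x[:16], y[:16])).batch(8)
        val_dataset = tf1.data.Dataset.from_tensor_slices(
            (x[16:], y[16:])).batch(8)

        iterator = tf1.data.Iterator.from_structure(
            tf1.data.get_output_types(train_dataset),
            tf1.data.get_output_shapes(train_dataset))
        train_init_op = iterator.make_initializer(train_dataset)
        val_init_op = iterator.make_initializer(val_dataset)
        data, labels = iterator.get_next()

        # Stand-in model: one linear layer built on the shared tensors.
        w = tf1.get_variable("w", shape=[3, 1],
                             initializer=tf1.zeros_initializer())
        loss = tf.reduce_mean(tf.square(tf.matmul(data, w) - labels))
        train_op = tf1.train.AdamOptimizer(learning_rate=0.1).minimize(loss)

        history = []
        with tf1.Session(graph=graph) as sess:
            sess.run(tf1.global_variables_initializer())
            for _ in range(epochs):
                # Training pass: running train_op updates the weights.
                sess.run(train_init_op)
                train_losses = []
                while True:
                    try:
                        _, batch_loss = sess.run([train_op, loss])
                    except tf.errors.OutOfRangeError:
                        break  # training data exhausted for this epoch
                    train_losses.append(batch_loss)

                # Validation pass: only the loss is evaluated, no updates.
                sess.run(val_init_op)
                val_losses = []
                while True:
                    try:
                        val_losses.append(sess.run(loss))
                    except tf.errors.OutOfRangeError:
                        break  # validation data exhausted
                history.append((len(train_losses), len(val_losses)))
    return history


if __name__ == "__main__":
    print(run_epochs())
```

The validation pass deliberately leaves `train_op` out of `sess.run`, so the weights are not updated on validation data.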

Let me know if anything is unclear.