I'm trying to run a training loop in which I periodically determine the current average loss and print it to the console. To determine the loss I'd like to use a different batch size. So it goes like this:

dataset = create_dataset().shuffle(1000).repeat().batch(minibatch_size)
iterator = dataset.make_one_shot_iterator()  # using this iterator in the graph
while ...:
    session.run(...)  # perform training
    if epoch % 10 == 0:
        test_avg_loss = session.run(avg_loss)  # want a different number of items here
During training I'd like to use minibatches of 10, but for testing I'd like to evaluate 100 data points to get a better estimate of the average loss. How can I make the dataset return a different number of items here? I tried passing a placeholder to batch, but that doesn't seem to be supported. The error is:

ValueError: Cannot capture a placeholder (name: batchSize, type: Placeholder) by value.

I'm open to a completely different code structure if that seems like a better solution. I understand that, for performance reasons, it's important not to pass in data via feed_dict, so using a dataset seems like the way to go. I'm not looking for some kind of hack, but I'd like to know the proper way to do this.
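(For context: the intuition that 100 points give a steadier average-loss estimate than 10 can be checked with a quick stdlib-only simulation; the losses below are synthetic and nothing here is TF-specific.)

```python
import random
import statistics

random.seed(0)

# Synthetic per-example losses: true mean 1.0 plus Gaussian noise.
def sample_losses(n):
    return [1.0 + random.gauss(0, 0.5) for _ in range(n)]

# Estimate the average loss 1000 times, with batches of 10 vs. 100.
est10 = [statistics.fmean(sample_losses(10)) for _ in range(1000)]
est100 = [statistics.fmean(sample_losses(100)) for _ in range(1000)]

# The spread of the estimates shrinks as the evaluation batch grows.
print(statistics.stdev(est10), statistics.stdev(est100))
```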
Answer 0 (score: 1)
A good solution is to use a reinitializable iterator, which lets you switch between two (or more) Datasets, typically one for training and one for validation.

The example in the documentation is actually quite neat:
# Define training and validation datasets with the same structure.
training_dataset = tf.data.Dataset.range(100).map(
    lambda x: x + tf.random_uniform([], -10, 10, tf.int64))
validation_dataset = tf.data.Dataset.range(50)

# A reinitializable iterator is defined by its structure. We could use the
# `output_types` and `output_shapes` properties of either `training_dataset`
# or `validation_dataset` here, because they are compatible.
iterator = tf.data.Iterator.from_structure(training_dataset.output_types,
                                           training_dataset.output_shapes)
next_element = iterator.get_next()

training_init_op = iterator.make_initializer(training_dataset)
validation_init_op = iterator.make_initializer(validation_dataset)

# Run 20 epochs in which the training dataset is traversed, followed by the
# validation dataset.
for _ in range(20):
    # Initialize an iterator over the training dataset.
    sess.run(training_init_op)
    for _ in range(100):
        sess.run(next_element)

    # Initialize an iterator over the validation dataset.
    sess.run(validation_init_op)
    for _ in range(50):
        sess.run(next_element)
Just make sure that the iterator you create has an unknown batch size.
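One practical consequence of leaving the batch size unknown: the last batch of an epoch may be smaller than the rest, so averaging the per-batch mean losses directly gives that final batch too much weight. A plain-Python sketch of the fix (weight each batch mean by its batch size; the names here are illustrative, not part of the TF API):

```python
def weighted_avg_loss(batch_means, batch_sizes):
    # Weight each per-batch mean loss by the number of examples in it.
    total = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)

# Two full batches of 100 and a final partial batch of 50.
means = [1.0, 1.0, 2.0]
sizes = [100, 100, 50]

print(weighted_avg_loss(means, sizes))  # 1.2
print(sum(means) / len(means))          # naive average: ~1.333, biased high
```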
Answer 1 (score: 1)
Based on your comment, you should look at the feedable iterator, which can be used together with a tf.placeholder to select the iterator to use in each call to tf.Session.run, via the familiar feed_dict mechanism. It offers the same functionality as a reinitializable iterator, but it does not require you to initialize the iterator from the start of a dataset when you switch between iterators.
# Training and validation datasets
training_dataset = tf.data.Dataset.range(100).repeat().batch(100)
validation_dataset = tf.data.Dataset.range(150, 200).repeat().batch(10)

# A feedable iterator to toggle between validation and training datasets
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, training_dataset.output_types, training_dataset.output_shapes)
next_element = iterator.get_next()

training_iterator = training_dataset.make_one_shot_iterator()
validation_iterator = validation_dataset.make_one_shot_iterator()

with tf.Session() as sess:
    # The `Iterator.string_handle()` method returns a tensor that can be evaluated
    # and used to feed the `handle` placeholder.
    training_handle = sess.run(training_iterator.string_handle())
    validation_handle = sess.run(validation_iterator.string_handle())

    # Run 20 epochs in which the training dataset is traversed, followed by the
    # validation dataset.
    for _ in range(20):
        for _ in range(100):
            out = sess.run(next_element, feed_dict={handle: training_handle})
        for _ in range(50):
            out = sess.run(next_element, feed_dict={handle: validation_handle})
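The advantage described above, namely that each dataset keeps its position when you switch, can be illustrated with a plain-Python analogue of the handle mechanism (ordinary generators standing in for the TF iterators; this is an analogy, not the TF API):

```python
import itertools

# Two endless "datasets", each with its own independent iterator.
iterators = {
    'train': iter(itertools.count(0)),     # yields 0, 1, 2, ...
    'valid': iter(itertools.count(1000)),  # yields 1000, 1001, ...
}

def next_element(handle):
    # Like feeding a string handle: the chosen iterator advances,
    # the other one keeps its position.
    return next(iterators[handle])

a = [next_element('train') for _ in range(3)]  # [0, 1, 2]
b = [next_element('valid') for _ in range(2)]  # [1000, 1001]
c = [next_element('train') for _ in range(2)]  # [3, 4]: position preserved
print(a, b, c)
```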
Answer 2 (score: 0)
Use a placeholder with shape [None, None], then do something like this during evaluation and training. Give your training file this structure:
import numpy as np
import tensorflow as tf

def shape(batch):
    # Shape your data here; `input_data` and `labels` stand for whatever
    # arrays you extract from `batch`.
    return {'input': np.array(input_data), 'label': np.array(labels)}

def evaluate(model, dataset, batch_size=100):
    sess = tf.get_default_session()
    iterations = len(dataset) // batch_size
    losses = []
    for j in range(iterations):
        batch = dataset[j * batch_size:(j + 1) * batch_size]
        # Shape it here before feeding it to the network.
        batch = shape(batch)
        out = sess.run(model, feed_dict={input_place: batch['input'],
                                         labels: batch['label']})
        losses.append(out['loss'])
    return np.mean(losses)

def train(model, dataset, batch_size=10, epochs=10):
    iterations = len(dataset) // batch_size
    with tf.Session() as sess:
        for i in range(epochs):
            for j in range(iterations):
                batch = dataset[j * batch_size:(j + 1) * batch_size]
                # Shape it here before feeding it to the network.
                batch = shape(batch)
                out = sess.run(model, feed_dict={input_place: batch['input'],
                                                 labels: batch['label']})
                print(out['loss'], out['training_accuracy'])
            print(evaluate(model, dataset))
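The slicing arithmetic this approach relies on can be checked in isolation. A stdlib-only sketch of splitting a toy dataset into fixed-size batches and averaging a per-batch statistic, as evaluate() does with losses (the numbers are hypothetical and there is no TF involved):

```python
data = list(range(10))                  # a toy "dataset" of 10 examples
batch_size = 4
iterations = len(data) // batch_size    # 2 full batches; the tail (8, 9) is dropped

batches = [data[j * batch_size:(j + 1) * batch_size] for j in range(iterations)]
print(batches)                          # [[0, 1, 2, 3], [4, 5, 6, 7]]

# Per-batch means, then their overall mean.
batch_means = [sum(b) / len(b) for b in batches]
print(batch_means)                      # [1.5, 5.5]
print(sum(batch_means) / len(batch_means))  # 3.5
```

Note that integer division drops any partial final batch, so a dataset whose length is not a multiple of batch_size will never be evaluated in full.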