TensorFlow: proper queueing/batching structure for using training and validation sets

Time: 2017-05-10 19:36:37

Tags: tensorflow batch-processing queueing

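I am trying to replicate the structure used in the TensorBoard MNIST example from the recent 2017 Dev Summit (code found here). In it, a feed_dict is used to alternate between the training and validation sets; however, they use the rather opaque mnist.train.next_batch, which makes it hard to work out how to iterate over your own data.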
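For reference, here is a minimal plain-Python sketch of what a next_batch-style helper might do under the hood. It is an assumption about the general pattern, not the actual implementation behind mnist.train.next_batch; the class name BatchIterator, its parameters, and the example filenames are hypothetical.

```python
import random


class BatchIterator:
    """Hypothetical stand-in for an opaque `next_batch` helper."""

    def __init__(self, items, labels, shuffle=True, seed=None):
        if len(items) != len(labels):
            raise ValueError("items and labels must have the same length")
        if len(items) == 0:
            raise ValueError("need at least one example")
        self._items = list(items)
        self._labels = list(labels)
        self._shuffle = shuffle
        self._rng = random.Random(seed)
        self._order = list(range(len(self._items)))  # serving order for this epoch
        self._pos = 0                                # next position in _order
        self.epochs_completed = 0
        if self._shuffle:
            self._rng.shuffle(self._order)

    def next_batch(self, batch_size):
        """Return (items, labels) for the next `batch_size` examples,
        wrapping around (and reshuffling) at the end of each epoch."""
        picked = []
        while len(picked) < batch_size:
            if self._pos == len(self._order):
                # End of epoch: count it, reshuffle, and start over.
                self.epochs_completed += 1
                self._pos = 0
                if self._shuffle:
                    self._rng.shuffle(self._order)
            picked.append(self._order[self._pos])
            self._pos += 1
        return ([self._items[i] for i in picked],
                [self._labels[i] for i in picked])


# Usage with made-up filenames and one-hot labels (3 classes).
train = BatchIterator(
    ["/file0.jpg", "/file1.jpg", "/file2.jpg"],
    [[0, 1, 0], [1, 0, 0], [0, 0, 1]],
    seed=0,
)
batch_x, batch_y = train.next_batch(2)
```

A separate instance per dataset (one for training, one for validation) would then give the two streams that a feed_dict helper alternates between.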

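Admittedly, this may also be because I am struggling to understand the queueing implementation in TensorFlow, and explicit examples seem to be in short supply, especially for TF > v1.0.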

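Based on various examples I stumbled across, I made my own attempt at a CNN for image classification. Initially I handled the data by storing it in a preloaded variable (it is a small dataset). I assumed it would be easier to get the train/valid swapping to work by feeding the data from filenames, so I tried changing to that.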

Somewhere between changing the format and trying to implement the feed_dict train/valid structure, I started getting the following:

Error: "You must feed a value for placeholder tensor 'input/Placeholder_2' with dtype string"

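Any tips on how to get this working, or further explanation of how the slicer / train.batch / QueueRunner actually work together, would be a great help, as I find the TensorFlow tutorials lacking when it comes to explaining the basic workflow between them.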

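I have a feeling that I have train.batch in completely the wrong place and that it should perhaps be inside the feed_dict def, but I don't know beyond that. Thanks!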

import tensorflow as tf
from tensorflow.python.framework import dtypes

# Input - 216x216x1 images; ~900 training images, ~350 validation
# Want to do batches of 5 for training, 20 for validation

learn_rate = .0001
drop_keep = 0.9
train_batch = 5
test_batch = 20
epochs = 1
iterations = int((885/train_batch) * epochs)        

#
#
# A BUNCH OF (graph-building) HELPER DEFINITIONS EXCLUDED FOR BREVITY
#
#




#x_init will be fed a list of .jpg filenames (ex: [/file0.jpg, /file1.jpg, ...])
#y_init will be fed an array of one-hot classes (ex: [[0,1,0], [1,0,0], ...])

sess = tf.InteractiveSession()

with tf.name_scope('input'):
    batch_size = tf.placeholder(tf.int32)
    keep_prob = tf.placeholder(tf.float32)
    x_init = tf.placeholder(dtype=tf.string, shape=(None,))  # 1-D list of filenames
    y_init = tf.placeholder(dtype=tf.int32, shape=(None, 3))  # 3 classes, one-hot

    image, label = tf.train.slice_input_producer([x_init, y_init])
    file = tf.read_file(image)
    image = tf.image.decode_jpeg(file, channels=1)
    image = tf.cast(image, tf.float32)
    image.set_shape([216,216,1])
    label = tf.cast(label, tf.int32)
    images, labels = tf.train.batch([image, label], batch_size=batch_size)



conv1 = conv_layer(images, [5,5,1], 40, 'conv1')
#
#
# skip the rest of graph defining/functions (merged,train_step)
# very similar to what is found in the MNIST example.
#
#
tf.summary.scalar('accuracy', accuracy)
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter(OUTPUT_LOC + '/train',sess.graph)
test_writer = tf.summary.FileWriter(OUTPUT_LOC + '/test')

sess.run(tf.global_variables_initializer())



#xTrain, yTrain, xTest, yTest are the train/valid images/labels lists
def feed_dict(train=True):
    if train:
        batch = train_batch
        keep = drop_keep
        xval = xTrain
        yval = yTrain
    else:
        batch = test_batch
        keep = 1
        xval = xTest
        yval = yTest
    return({x_init:xval, y_init:yval, batch_size:batch, keep_prob:keep})



#If I run "threads", I get the error. It works up until here.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess,coord=coord)



#Don't know what works here or what doesn't.
for i in range(iterations):
    if i % 10 == 0:
        summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
        test_writer.add_summary(summary, i)
        print('Accuracy at step %s: %s' % (i, acc))
    else:
        if i % 100 == 99:
            run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
            run_metadata = tf.RunMetadata()
            summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True), options=run_options, run_metadata=run_metadata)
            train_writer.add_run_metadata(run_metadata, 'step%03d' % i)
            train_writer.add_summary(summary, i)
            print('Adding run metadata for', i)
        else:  # Record a summary
            summary, _ = sess.run([merged, train_step],feed_dict=feed_dict(True))
            train_writer.add_summary(summary, i)
coord.request_stop()
coord.join(threads)  # wait for the queue-runner threads to finish
train_writer.close()
test_writer.close()
sess.close()
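One likely reading of the error: the threads started by tf.train.start_queue_runners run their enqueue ops in the background, and those runs do not receive the feed_dict passed to sess.run in the training loop. Since slice_input_producer above is built on the x_init placeholder, the background enqueue would have nothing fed into it. The plain-Python analogy below is not TensorFlow code; it only sketches the division of labour between a producer, a background runner thread, and a batching dequeue. The names start_runner and dequeue_batch and the example data are hypothetical.

```python
import queue
import threading


def start_runner(data, out_queue, stop_event):
    """Hypothetical analogue of a queue runner: a background thread that
    cycles over `data` and enqueues one example at a time. Note that
    `data` is bound when the thread starts, not per training step."""
    def run():
        i = 0
        while not stop_event.is_set():
            try:
                out_queue.put(data[i % len(data)], timeout=0.05)
            except queue.Full:
                continue  # queue is full: re-check the stop flag, then retry
            i += 1

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def dequeue_batch(in_queue, batch_size):
    """Analogue of a batching dequeue: block until `batch_size`
    examples have been produced, then return them as a list."""
    return [in_queue.get(timeout=5) for _ in range(batch_size)]


# Two independent pipelines, each with its own fixed data and queue.
train_examples = [("/file%d.jpg" % i, i % 3) for i in range(3)]
valid_examples = [("/valid%d.jpg" % i, i % 3) for i in range(4)]

stop = threading.Event()
train_q = queue.Queue(maxsize=32)
valid_q = queue.Queue(maxsize=32)
threads = [
    start_runner(train_examples, train_q, stop),
    start_runner(valid_examples, valid_q, stop),
]

train_batch_items = dequeue_batch(train_q, 5)    # consumer picks the batch size
valid_batch_items = dequeue_batch(valid_q, 20)

stop.set()              # analogue of coord.request_stop()
for t in threads:
    t.join()            # analogue of coord.join(threads)
```

Under this reading, the batch size can still be chosen by the consumer at each step, but the data source has to be known to the background thread up front. One option would therefore be to build a separate input pipeline per dataset from fixed filename lists rather than feeding filenames through a placeholder; whether that is the best fit for this model is not established here.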

0 answers:

No answers