Question

Please take a look at the following code snippet first:

import tensorflow as tf  # TF 1.x queue-based input API

# read_and_decode and read_and_decode_with_random_processing are defined elsewhere.
def inputs(filenames, batch_size, num_epochs, shuffle=True):
    with tf.name_scope('input'):
        # Queue of input file names (capacity #1).
        filename_queue = tf.train.string_input_producer(filenames,
                                                        shuffle=shuffle,
                                                        capacity=32 * batch_size,
                                                        num_epochs=num_epochs)

        if shuffle:
            image, label = read_and_decode_with_random_processing(filename_queue)
            # Shuffling example queue (capacity #2, num_threads #1).
            images, sparse_labels = tf.train.shuffle_batch([image, label],
                                                           batch_size=batch_size,
                                                           num_threads=32,
                                                           capacity=16 * batch_size,
                                                           min_after_dequeue=8 * batch_size)
        else:
            image, label = read_and_decode(filename_queue)
            # FIFO example queue (capacity #3, num_threads #2).
            images, sparse_labels = tf.train.batch([image, label],
                                                   batch_size=batch_size,
                                                   num_threads=32,
                                                   capacity=8 * batch_size,
                                                   allow_smaller_final_batch=True)

        return images, sparse_labels

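There are three capacity parameters and two num_threads parameters. In practice, I find it hard to choose these numbers so that the GPU stays busy. The GPU seems to consume data much faster than the input pipeline can load it. How do you determine these numbers in practice? It seems to me that larger values should be better, since the queues would always be full and the GPU would never have to wait. But if I set them too large, prefetching becomes slow. Maybe I am missing something.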
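
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in plain Python (no TensorFlow needed). It assumes the shuffling queue cannot finish its first batch until min_after_dequeue + batch_size examples have been enqueued, and that the buffer can hold up to capacity examples in memory. The function name, the per-example size, and the throughput figures are all hypothetical placeholders, not measurements from this pipeline.

```python
def shuffle_queue_estimates(batch_size, min_after_dequeue, capacity,
                            bytes_per_example, producer_examples_per_sec,
                            consumer_examples_per_sec):
    """Rough estimates for a shuffling batch queue (hypothetical helper)."""
    if capacity <= min_after_dequeue:
        raise ValueError("capacity must be larger than min_after_dequeue")

    # Examples that must be enqueued before the first full batch is available.
    warmup_examples = min_after_dequeue + batch_size
    warmup_seconds = warmup_examples / producer_examples_per_sec

    # Upper bound on the memory held by a completely full queue.
    max_buffer_bytes = capacity * bytes_per_example

    # If the readers are slower than the consumer, the queue drains no matter
    # how large it is; a bigger capacity only delays the stall.
    input_bound = producer_examples_per_sec < consumer_examples_per_sec

    return {
        "warmup_seconds": warmup_seconds,
        "max_buffer_bytes": max_buffer_bytes,
        "input_bound": input_bound,
    }


# Hypothetical numbers: 224x224x3 float32 images, batch size 128,
# readers producing 500 examples/s, GPU consuming 800 examples/s.
estimates = shuffle_queue_estimates(
    batch_size=128,
    min_after_dequeue=8 * 128,
    capacity=16 * 128,
    bytes_per_example=224 * 224 * 3 * 4,
    producer_examples_per_sec=500.0,
    consumer_examples_per_sec=800.0,
)
print(estimates)
```

Under these assumptions, a larger min_after_dequeue lengthens the start-up wait and a larger capacity raises peak memory, while neither helps once the readers are slower than the GPU. In that case the quantity worth measuring is the sustained read/decode throughput rather than the queue sizes.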

How to determine the optimal configuration for batch input in TensorFlow?

0 Answers: