I am implementing the SSD image-preprocessing pipeline with the TensorFlow Dataset API. The dataset code is:
dataset = dataset.prefetch(buffer_size=batch_size)

is_training = (mode == 'Train')
if is_training:
    dataset = dataset.shuffle(buffer_size=shuffle_buffer)

# If we are training over multiple epochs before evaluating, repeat the
# dataset for the appropriate number of epochs.
if num_epochs is not None:
    dataset = dataset.repeat(num_epochs)

if is_training and num_gpus and examples_per_epoch:
    total_examples = num_epochs * examples_per_epoch
    total_batches = total_examples // batch_size // num_gpus * num_gpus
    # take() returns a new dataset, so the result must be assigned back.
    dataset = dataset.take(total_batches * batch_size)

# Parse the raw records into images and labels. Testing has shown that setting
# num_parallel_batches > 1 produces no improvement in throughput, since
# batch_size is almost always much greater than the number of CPU cores.
dataset = dataset.apply(
    tf.contrib.data.map_and_batch(
        parse_record_fn,
        batch_size=batch_size,
        num_parallel_batches=1,
        drop_remainder=True))

dataset = dataset.prefetch(buffer_size=tf.contrib.data.AUTOTUNE)
Here `parse_record_fn` performs random cropping and color/saturation augmentation, and the whole dataset pipeline runs on the CPU. I set the batch size to 32 and the image size to 640 * 640 * 3. In my experiments, GPU utilization is only about 34%, which suggests too much time is being spent on image preprocessing. Is there any way to speed it up?
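For context, the direction I have been experimenting with is raising the parallelism of the map step instead of running `parse_record_fn` serially. This is only a sketch against the TF 2.x `tf.data` API (in my 1.x code the corresponding knobs would be `num_parallel_batches` in `map_and_batch` and `tf.contrib.data.AUTOTUNE`), and the parse function below is a trivial stand-in, not my real SSD decode/crop/jitter logic:

```python
import tensorflow as tf

def parse_record_fn(x):
    # Stand-in for the real decode + random crop + color jitter.
    return tf.cast(x, tf.float32) * 2.0

dataset = tf.data.Dataset.range(100)
# Parallelize preprocessing across CPU cores instead of mapping serially.
dataset = dataset.map(parse_record_fn,
                      num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.batch(32, drop_remainder=True)
# Overlap CPU-side preprocessing with the accelerator's training step.
dataset = dataset.prefetch(tf.data.AUTOTUNE)

for batch in dataset.take(1):
    print(batch.shape)  # (32,)
```

Would this kind of change (parallel map plus prefetch) be the right lever here, or is the bottleneck likely elsewhere?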
Thanks!