I have split my dataset into 10 tfrecords files, and I want to read 100 data points from each file to create a batch of 10 sequences of 100 data points each. I use the following function to do that. The data loading time from the tfrecords starts out slow, then reaches about 0.65 s, and after 100-200 sess.run calls it increases to about 10 s. Could you point out any mistake, or suggest anything that might help reduce the reading time? Also, the behavior I described sometimes becomes even more erratic.
def get_data(mini_batch_size):
    data = []
    for i in range(mini_batch_size):
        filename_queue = tf.train.string_input_producer([data_path + 'Features' + str(i) + '.tfrecords'])
        reader = tf.TFRecordReader()
        # Read step_size serialized examples at once from this file
        _, serialized_example = reader.read_up_to(filename_queue, step_size)
        features = tf.parse_example(serialized_example, features={'feature_raw': tf.VarLenFeature(dtype=tf.float32)})
        feature = features['feature_raw'].values
        feature = tf.reshape(feature, [step_size, ConvLSTM.H, ConvLSTM.W, ConvLSTM.Di])
        data.append(feature)
    return tf.stack(data)
I observe the same behavior even when I pull from a single file as shown below. Moreover, increasing num_threads does not help.
with tf.device('/cpu:0'):
    filename_queue = tf.train.string_input_producer(['./Data/TFRecords/Features' + str(i) + '.tfrecords'])
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    batch_serialized_example = tf.train.batch([serialized_example], batch_size=100, num_threads=1, capacity=100)
    features = tf.parse_example(batch_serialized_example, features={'feature_raw': tf.VarLenFeature(dtype=tf.float32)})
    feature = features['feature_raw'].values
    data.append(feature)

data = tf.stack(data)
init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
sess = tf.Session(config=tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1, allow_soft_placement=True))
sess.run(init_op)
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

for i in range(1000):
    t = time.time()
    D = sess.run(data)
    print(time.time() - t)
Answer 0 (score: 0)
I think you are trying to create your mini-batches yourself, but you should use the tensorflow queues such as tf.train.batch or tf.train.shuffle_batch to do it for you.
Your input pipeline should look like this:
# Create a filename queue from the list of tfrecord filenames
filename_queue = tf.train.string_input_producer(filenames)

# Create a reader to populate the queue of examples
reader = tf.TFRecordReader()
_, serialized_example = reader.read_up_to(filename_queue, step_size)

# Parse the example protos
features = tf.parse_example(serialized_example, features={'feature_raw': tf.VarLenFeature(dtype=tf.float32)})
feature = features['feature_raw'].values
feature = tf.reshape(feature, [step_size, ConvLSTM.H, ConvLSTM.W, ConvLSTM.Di])

# Shuffling queue that creates batches of data
features = tf.train.shuffle_batch([feature], batch_size=batch_size, num_threads=2, capacity=MIN_AFTER_DEQUEUE + 3 * batch_size, min_after_dequeue=MIN_AFTER_DEQUEUE)
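A minimal sketch of how you might drive this pipeline, assuming the graph above has been built (the names features, filenames, batch_size, step_size, and MIN_AFTER_DEQUEUE come from that snippet; the timing loop mirrors the one in the question):

import time
import tensorflow as tf

sess = tf.Session()
sess.run(tf.group(tf.global_variables_initializer(), tf.local_variables_initializer()))

# Start the queue-runner threads that keep the filename and batching queues full
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

for _ in range(1000):
    t = time.time()
    # Dequeues one shuffled batch of shape [batch_size, step_size, H, W, Di]
    batch = sess.run(features)
    print(time.time() - t)

# Cleanly stop the background threads when done
coord.request_stop()
coord.join(threads)
sess.close()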
To improve your data loading time, the following points should help:

- MIN_AFTER_DEQUEUE is very important. Setting it to a large number gives a slower start-up and uses more memory, but yields better run-time numbers.
- Do the input data preprocessing on the CPU, while the rest of the compute-intensive matrix operations run on the GPU. If your GPU utilization is not close to 100%, it means the bottleneck is the CPU not loading enough data.
- Try to keep a few large tfrecords files instead of many small ones, so that data can be read sequentially faster without switching between multiple files.
- If you are dealing with images, don't save raw images to the tfrecords; use jpeg or a similar format instead, so that the files are smaller and can be read faster. The jpeg decode computation cost is very small for a GPU (see the sketch after this list).
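To illustrate the last point, here is a minimal sketch of reading jpeg-encoded images from a tfrecords file, assuming each example stores the encoded bytes under a hypothetical 'image_jpeg' key (the filename, image size, and batch parameters are placeholders):

import tensorflow as tf

filename_queue = tf.train.string_input_producer(['images.tfrecords'])
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)

# Parse an example that stores a jpeg-encoded string instead of raw pixels,
# so the file on disk stays small and reads stay fast
parsed = tf.parse_single_example(serialized_example, features={'image_jpeg': tf.FixedLenFeature([], tf.string)})

# Decode on the fly; this cost is negligible compared to GPU compute
image = tf.image.decode_jpeg(parsed['image_jpeg'], channels=3)
image = tf.image.resize_images(image, [224, 224])  # fixed shape so batching works

image_batch = tf.train.shuffle_batch([image], batch_size=32, num_threads=2, capacity=1000 + 3 * 32, min_after_dequeue=1000)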