I trained a speech-recognition network with TensorFlow, previously feeding data through tf.FIFOQueue, but the training speed did not meet my requirements. I then switched to tf.data.TFRecordDataset and converted the binary input files to TFRecords, and training actually became slower. I followed https://www.tensorflow.org/guide/performance/datasets; my preprocessing code is below:
```python
import tensorflow as tf
from multiprocessing import cpu_count

# `model_settings`, `FLAGS`, and `loader` are defined elsewhere in my code.
def read_and_decode(loader, handle, num_epochs=1):
    """Read TFRecord-format data and build the multi-GPU input pipeline."""
    batch_size = loader.batch_size()
    feature_size = model_settings['fingerprint_size']

    def parse_exmp(serialized_example):
        features = tf.parse_single_example(serialized_example, features={
            'feature': tf.VarLenFeature(tf.float32),
            'label': tf.VarLenFeature(tf.int64),
            'mask': tf.VarLenFeature(tf.int64),
            'length': tf.FixedLenFeature((), tf.int64)
        })
        length = tf.cast(features['length'], tf.int32)
        feature = tf.sparse_tensor_to_dense(features['feature'])
        feature = tf.reshape(feature, [length, feature_size])
        label = tf.sparse_tensor_to_dense(features['label'])
        mask = tf.sparse_tensor_to_dense(features['mask'])
        return feature, label, mask, length

    # I also tried parallel_interleave over the sharded files:
    # filenames = tf.data.Dataset.list_files(
    #     "./train_input/tfrecords_file/train_dataset_*.tfrecords")
    # dataset = filenames.apply(tf.contrib.data.parallel_interleave(
    #     lambda filename: tf.data.TFRecordDataset(filename), cycle_length=4))
    filenames = ['./train_input/tfrecords_file/train_dataset_%d.tfrecords' % i
                 for i in range(cpu_count() - 1)]
    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=10)
    dataset = dataset.map(parse_exmp, num_parallel_calls=48)
    dataset = dataset.prefetch(buffer_size=batch_size)
    dataset = dataset.apply(tf.contrib.data.shuffle_and_repeat(10000, num_epochs))
    dataset = dataset.padded_batch(
        batch_size, padded_shapes=([None, feature_size], [None], [None], []))

    train_iterator = dataset.make_initializable_iterator()
    # Feedable iterator: the concrete iterator is selected at run time via `handle`.
    iterator = tf.data.Iterator.from_string_handle(
        handle, dataset.output_types, dataset.output_shapes)
    batch_data, batch_label, batch_mask, batch_length = iterator.get_next()

    # Split the batch across GPUs and make the tensors time-major.
    batch_data = [tf.transpose(data, (1, 0, 2))
                  for data in tf.split(batch_data, FLAGS.gpu_num, axis=0)]
    batch_label = tf.split(batch_label, FLAGS.gpu_num, axis=0)
    batch_mask = tf.split(batch_mask, FLAGS.gpu_num, axis=0)
    return (train_iterator, batch_data,
            [tf.transpose(label, (1, 0)) for label in batch_label],
            [tf.transpose(mask, (1, 0)) for mask in batch_mask],
            batch_length)
```
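For context, the feedable iterator returned above is driven from the training loop roughly like this. This is a simplified sketch: `train_op` and `loader` stand in for the real training op and data loader defined elsewhere in my code.

```python
handle = tf.placeholder(tf.string, shape=[])
train_iterator, batch_data, batch_label, batch_mask, batch_length = \
    read_and_decode(loader, handle, num_epochs=10)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Resolve the concrete iterator that `handle` should point at.
    train_handle = sess.run(train_iterator.string_handle())
    sess.run(train_iterator.initializer)
    while True:
        try:
            sess.run(train_op, feed_dict={handle: train_handle})
        except tf.errors.OutOfRangeError:
            break  # raised once num_epochs are exhausted
```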
Here is my speed table:

[Image: training-speed table, tf.FIFOQueue vs. tf.data.Dataset]

We can see that tf.data.Dataset is slower than tf.FIFOQueue. Here is the timeline for tf.FIFOQueue:

[Image: tf.FIFOQueue timeline trace]

and the timeline for tf.Dataset:

[Image: tf.data.Dataset timeline trace]
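A timeline like the ones above can be captured with TF 1.x's run-metadata tracing. This is a minimal sketch of that profiling step (simplified; it reuses `sess`, `train_op`, `handle`, and `train_handle` from the sketch above):

```python
from tensorflow.python.client import timeline

run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
sess.run(train_op, feed_dict={handle: train_handle},
         options=run_options, run_metadata=run_metadata)

# Dump a Chrome trace, viewable at chrome://tracing.
tl = timeline.Timeline(run_metadata.step_stats)
with open('timeline.json', 'w') as f:
    f.write(tl.generate_chrome_trace_format())
```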
In the tf.Dataset timeline we can see that IteratorGetNext alone takes about 200 ms, which is far too long, and the rest of the step is still slower than with tf.FIFOQueue even if we ignore IteratorGetNext. My question: why is tf.data.Dataset still so much slower than tf.FIFOQueue even though I have already optimized the code following the guide? Thanks.
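One difference I notice between my pipeline and the linked guide: the guide recommends placing prefetch at the very end of the pipeline, after batching, so that whole batches are prepared while the accelerator is busy. A sketch of that alternative ordering, reusing the names from read_and_decode above (I have not benchmarked this variant):

```python
dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=10)
dataset = dataset.apply(tf.contrib.data.shuffle_and_repeat(10000, num_epochs))
dataset = dataset.map(parse_exmp, num_parallel_calls=48)
dataset = dataset.padded_batch(
    batch_size, padded_shapes=([None, feature_size], [None], [None], []))
dataset = dataset.prefetch(1)  # prefetch whole batches, last in the pipeline
```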