TensorFlow performance bottleneck on IteratorGetNext: less efficient than tf.FIFOQueue

Asked: 2018-11-06 03:41:12

Tags: python tensorflow

I trained a speech recognition network with TensorFlow, previously feeding data with tf.FIFOQueue, but the training speed did not meet my requirements. After switching to tf.data.TFRecordDataset and converting the binary files to tfrecords, it got even slower. I followed https://www.tensorflow.org/guide/performance/datasets; the preprocessing code is below:

import tensorflow as tf
from multiprocessing import cpu_count

# model_settings and FLAGS are defined elsewhere in the training script.
def read_and_decode(loader, handle, num_epochs=1):
  """Read tfrecord-format data and build the multi-GPU input pipeline."""
  batch_size = loader.batch_size()
  feature_size = model_settings['fingerprint_size']

  def parse_exmp(serialized_example):
    """Parse one serialized example into dense (feature, label, mask, length)."""
    features = tf.parse_single_example(serialized_example, features={
        'feature': tf.VarLenFeature(tf.float32),
        'label':   tf.VarLenFeature(tf.int64),
        'mask':    tf.VarLenFeature(tf.int64),
        'length':  tf.FixedLenFeature((), tf.int64)
        })
    length = tf.cast(features['length'], tf.int32)
    feature = tf.sparse_tensor_to_dense(features['feature'])
    feature = tf.reshape(feature, [length, feature_size])
    label = tf.sparse_tensor_to_dense(features['label'])
    mask = tf.sparse_tensor_to_dense(features['mask'])
    return feature, label, mask, length

  # Variant I also tried: interleave reads across the shard files.
  # filenames = tf.data.Dataset.list_files("./train_input/tfrecords_file/train_dataset_*.tfrecords")
  # dataset = filenames.apply(tf.contrib.data.parallel_interleave(
  #     lambda filename: tf.data.TFRecordDataset(filename), cycle_length=4))
  filenames = ['./train_input/tfrecords_file/train_dataset_%d.tfrecords' % i
               for i in range(cpu_count() - 1)]
  dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=10)

  dataset = dataset.map(parse_exmp, num_parallel_calls=48)
  dataset = dataset.prefetch(buffer_size=batch_size)
  dataset = dataset.apply(tf.contrib.data.shuffle_and_repeat(10000, num_epochs))
  dataset = dataset.padded_batch(
      batch_size, padded_shapes=([None, feature_size], [None], [None], []))
  train_iterator = dataset.make_initializable_iterator()

  iterator = tf.data.Iterator.from_string_handle(
      handle, dataset.output_types, dataset.output_shapes)

  # Split each batch across the GPUs and make the tensors time-major.
  batch_data, batch_label, batch_mask, batch_length = iterator.get_next()
  batch_data = [tf.transpose(data, (1, 0, 2))
                for data in tf.split(batch_data, FLAGS.gpu_num, axis=0)]
  batch_label = tf.split(batch_label, FLAGS.gpu_num, axis=0)
  batch_mask = tf.split(batch_mask, FLAGS.gpu_num, axis=0)

  return (train_iterator,
          batch_data,
          [tf.transpose(label, (1, 0)) for label in batch_label],
          [tf.transpose(mask, (1, 0)) for mask in batch_mask],
          batch_length)
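
For context on the file format: I don't include my conversion script here, but the tfrecords are written with a schema matching the parse spec above (variable-length float 'feature', variable-length int64 'label' and 'mask', scalar int64 'length'). A minimal sketch of such a writer, with illustrative names like make_example and write_shard (not my exact code), would look roughly like this:

import tensorflow as tf

def make_example(feature, label, mask):
  """Serialize one utterance; feature is a numpy array of shape [length, feature_size]."""
  return tf.train.Example(features=tf.train.Features(feature={
      'feature': tf.train.Feature(float_list=tf.train.FloatList(value=feature.reshape(-1))),
      'label':   tf.train.Feature(int64_list=tf.train.Int64List(value=label)),
      'mask':    tf.train.Feature(int64_list=tf.train.Int64List(value=mask)),
      'length':  tf.train.Feature(int64_list=tf.train.Int64List(value=[feature.shape[0]])),
  }))

def write_shard(path, examples):
  """examples yields (feature, label, mask) tuples for one shard file."""
  with tf.python_io.TFRecordWriter(path) as writer:
    for feature, label, mask in examples:
      writer.write(make_example(feature, label, mask).SerializeToString())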

Here is my speed table: [image: speed comparison table]. We can see that tf.data is slower than tf.FIFOQueue. The tf.FIFOQueue timeline: [image: tf.FIFOQueue timeline]. The tf.data timeline: [image: tf.data timeline].

We can see that IteratorGetNext takes far too long, about 200 ms, and tf.data is still slow even with IteratorGetNext excluded. My question is: why does tf.data take so much longer than tf.FIFOQueue, even though I have already optimized the code? Thanks.
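
Update: one thing I am unsure about is the position of prefetch. The linked guide recommends adding prefetch as the last transformation of the pipeline, while mine sits between map and shuffle_and_repeat. A sketch of the guide's ordering, reusing the same names as in read_and_decode above:

# Same transformations as above, but with prefetch moved to the end so that
# producing a padded batch overlaps with training on the previous batch.
dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=10)
dataset = dataset.map(parse_exmp, num_parallel_calls=48)
dataset = dataset.apply(tf.contrib.data.shuffle_and_repeat(10000, num_epochs))
dataset = dataset.padded_batch(
    batch_size, padded_shapes=([None, feature_size], [None], [None], []))
dataset = dataset.prefetch(1)  # buffer one full batch ahead of the training step
train_iterator = dataset.make_initializable_iterator()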

0 Answers:

There are no answers yet.