tf.data: migrating to dataset.interleave

Time: 2019-09-09 07:07:35

Tags: python tensorflow

TensorFlow nightly: 1.15.0-dev20190730

filenames = tf.gfile.Glob(data_files_pattern)
dataset = tf.data.Dataset.from_tensor_slices(filenames).repeat()

def _read_fn(f):
  return tf.data.TFRecordDataset(f)

dataset = dataset.apply(tf.data.experimental.parallel_interleave(
    map_func=_read_fn,
    cycle_length=CYCLE_LENGTH,
    block_length=BLOCK_LENGTH,
    sloppy=True,
    buffer_output_elements=BUFFER_OUTPUT_ELEMENTS,
    prefetch_input_elements=BUFFER_INPUT_ELEMENTS))
dataset = dataset.batch(BATCH_SIZE, drop_remainder=False)
dataset = dataset.prefetch(PREFETCH)
return dataset

With the code above I get the following warning:

WARNING:tensorflow:From sample.py:35: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W0909 06:50:51.144233 140600866592512 deprecation.py:323] From sample.py:35: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.

When I migrate as follows to get rid of the warning, reading becomes slower and CPU utilization drops:

filenames = tf.gfile.Glob(data_files_pattern)
dataset = tf.data.Dataset.from_tensor_slices(filenames).repeat()

def _read_fn(f):
  return tf.data.TFRecordDataset(f)

options = tf.data.Options()
options.experimental_deterministic = True
dataset = dataset.interleave(
    map_func=_read_fn,
    cycle_length=CYCLE_LENGTH,
    block_length=BLOCK_LENGTH,
    num_parallel_calls=tf.data.experimental.AUTOTUNE).with_options(options)
dataset = dataset.batch(BATCH_SIZE, drop_remainder=False)
dataset = dataset.prefetch(PREFETCH)
return dataset

Did I migrate this correctly?

1 answer:

Answer 0 (score: 1)

The problem is that you are comparing a sloppy (non-deterministic) parallel_interleave with a deterministic interleave. You set sloppy=True for parallel_interleave, so for the migration to be equivalent you need to set

options.experimental_deterministic = False
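
For reference, here is a minimal sketch of the migrated pipeline with the sloppy behaviour preserved. The constant values and the build_dataset wrapper are hypothetical (the question does not give them), and the file list is passed in rather than globbed. It assumes a TensorFlow version where tf.data.Options still accepts experimental_deterministic; as far as I know, later releases also expose the same setting as options.deterministic.

```python
import tensorflow as tf

# Hypothetical values for illustration; the question does not state them.
CYCLE_LENGTH = 4
BLOCK_LENGTH = 1
BATCH_SIZE = 2
PREFETCH = 1


def _read_fn(f):
  return tf.data.TFRecordDataset(f)


def build_dataset(filenames):
  """Builds the interleaved input pipeline with relaxed element ordering."""
  dataset = tf.data.Dataset.from_tensor_slices(filenames).repeat()

  # Counterpart of sloppy=True in parallel_interleave: elements may be
  # produced out of order, so a slow file does not stall the others.
  options = tf.data.Options()
  options.experimental_deterministic = False

  dataset = dataset.interleave(
      map_func=_read_fn,
      cycle_length=CYCLE_LENGTH,
      block_length=BLOCK_LENGTH,
      num_parallel_calls=tf.data.experimental.AUTOTUNE).with_options(options)
  dataset = dataset.batch(BATCH_SIZE, drop_remainder=False)
  dataset = dataset.prefetch(PREFETCH)
  return dataset
```

Note that buffer_output_elements and prefetch_input_elements have no counterpart in the interleave signature suggested by the deprecation message, so they are dropped here. With the deterministic flag set to False, the order of records across files is not guaranteed, which is the same trade-off sloppy=True made.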