On-the-fly data generation with the TensorFlow Dataset API

Time: 2017-11-15 22:57:06

Tags: python tensorflow tensorflow-datasets

I have a function which produces feature and target tensors. E.g.

x, t = myfunc()  # x and t are tensors

How can I integrate this with TensorFlow's Dataset API for continuous training? Ideally I would like to use the dataset to handle things like batching and transformations.

Edit for clarification: I don't want to just put x and t directly into my graph. I would like to build a dataset from them, so that I can reuse the dataset processing I have already implemented for (normal) finite datasets, which I can load into memory and feed into the same graph using an initializable iterator.

2 answers:

Answer 0 (score: 2)

Assuming x and t are tf.Tensor objects and my_func() builds a TensorFlow graph, you can do this with `Dataset.map()` as follows:

# Creates an infinite dataset with a dummy value. You can make this finite by
# specifying an explicit number of elements to `repeat()`.
dummy_dataset = tf.data.Dataset.from_tensors(0).repeat(None)

# Evaluates `my_func` once for each element in `dummy_dataset`.
dataset = dummy_dataset.map(lambda _: my_func())
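
To connect this with the batching the question asks about, here is a minimal sketch of the same pattern followed by `batch()`. It assumes TensorFlow 2.x with eager execution (the original answer predates it), and the `my_func` below is a hypothetical stand-in that returns random features and their sum as the target; it is not the asker's actual function.

```python
import tensorflow as tf


def my_func():
    # Hypothetical stand-in for the asker's generator (an assumption):
    # x is a random feature vector, t is the sum of its entries.
    x = tf.random.uniform([3])
    t = tf.reduce_sum(x, keepdims=True)
    return x, t


# Infinite dummy dataset; my_func() runs once per element, then elements
# are grouped into batches of 4 like any other dataset.
dataset = (
    tf.data.Dataset.from_tensors(0)
    .repeat()
    .map(lambda _: my_func())
    .batch(4)
)

# The dataset is infinite, so take() bounds how much is pulled from it.
batches = list(dataset.take(2))
x_batch, t_batch = batches[0]
```

Because the generated dataset behaves like any other `tf.data.Dataset`, further transformations such as `map()` or `prefetch()` can be chained after it in the same way as for a finite in-memory dataset.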

Answer 1 (score: 0)

If x and t are tensors, you can create a dataset by calling tf.data.Dataset.from_tensors or tf.data.Dataset.from_tensor_slices (see the tf.data.Dataset documentation).

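The difference between them is that from_tensors combines the input tensors into a single element of the dataset, whereas from_tensor_slices creates a dataset with one element per slice along the first dimension of the input.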
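
The difference can be checked with a small sketch. It assumes TensorFlow 2.x with eager execution, and the tensor values are made up for illustration.

```python
import tensorflow as tf

# Illustrative data (an assumption): 3 examples with 2 features each.
x = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
t = tf.constant([0, 1, 0])

# from_tensors: the whole (x, t) pair becomes one dataset element.
whole = tf.data.Dataset.from_tensors((x, t))

# from_tensor_slices: one element per row along the first dimension.
sliced = tf.data.Dataset.from_tensor_slices((x, t))

n_whole = len(list(whole))
n_sliced = len(list(sliced))
first_x, first_t = next(iter(sliced))
```

Here `whole` holds a single element containing the full tensors, while `sliced` yields three (features, target) pairs, which is usually the form wanted before calling `batch()`.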