Using the Dataset API

Date: 2017-08-30 09:00:11

Tags: tensorflow

I am trying to use the from_generator interface of the Dataset API to inject multiple "rounds" of input into the graph.

On my first attempt, I used the repeat() function to make the generator run three times in a row. However, unless the batch size in the batch_join call is an even multiple of the number of iterations per round (here, a batch size of 3 against 10 iterations per round), data from different "rounds"/"epochs" ends up in the same batch (depending on the order in which the tensors are processed; there is some parallelism in the graph).
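The mixing effect described above can be sketched in plain Python (a toy model with hypothetical helper names, not the question's actual code): a batcher that pulls from a repeated multi-epoch stream straddles epoch boundaries whenever the batch size does not evenly divide the per-epoch iteration count.

```python
from itertools import islice

def epochs(n_epochs, n_items):
    # Mimics from_generator(...).repeat(n_epochs): yields (epoch, item) pairs.
    for e in range(n_epochs):
        for i in range(n_items):
            yield (e, i)

def batches(stream, batch_size):
    # Mimics a naive batcher: keeps pulling until batch_size items arrive,
    # paying no attention to epoch boundaries.
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# 10 iterations per epoch, batch size 3: since 10 % 3 != 0, some batches
# contain items from two different epochs.
mixed = [b for b in batches(epochs(3, 10), 3)
         if len({e for e, _ in b}) > 1]
print(len(mixed))  # number of batches that straddle an epoch boundary
```

With 9 iterations per epoch instead of 10 (so the batch size divides evenly), no batch mixes epochs, which matches the "even multiple" condition above.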

On my second attempt, I tried re-running the iterator after each epoch completes. However, once tf.errors.OutOfRangeError is thrown by sess.run() on the output of the batch call, all subsequent calls throw OutOfRangeError again, even after re-running the iterator's initializer.

I want to inject multiple rounds of input into the graph consecutively, rather than overlapping them as in my first example (e.g. by using allow_smaller_final_batch on the batch op). Some of the kernels instantiated in my custom TensorFlow fork are very expensive to restart, e.g. mmap-ing an O(10 GB) file, so I would like to get the best of both worlds somehow.

1 Answer:

Answer 0 (score: 2):

I think the problem stems from using Dataset.from_generator() (which supports reinitialization) with tf.train.batch_join() (which uses TensorFlow queues and queue-runners, and hence does not support reinitialization).

I'm not completely clear what your code is doing, but I think you can implement the entire pipeline as a tf.contrib.data.Dataset. Replace the following fragment of code:

tf.train.batch_join()

...with something like the following:

tf.contrib.data.Dataset