TensorFlow Dataset.from_generator blocking input?

Date: 2017-12-21 02:58:06

Tags: python multithreading tensorflow

I want to build a project in which requests are put into a Python Queue at arbitrary times, a group of TensorFlow models consume the requests from the queue, and the results are returned immediately.

The models run in different threads with different tf.Graphs, but their structure and weight values are identical.

Each model uses tf.data.Dataset.from_generator to wrap a Python iterator that fetches requests from the queue.

The problem is that with multiple models, a request can get blocked until future requests arrive. From the test results, the Python iterator does receive the request as soon as it is put into the queue, but no result comes out of any model. The request does not appear to be dropped either; it seems to be held up inside the tf.data iterator.
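
The suspected behavior can be illustrated with a plain-Python sketch (no TensorFlow involved, purely an illustration): if each model's input pipeline eagerly pulls a request from the shared queue as soon as one is available, an idle model can grab a request and simply sit on it, so the request is consumed but no result is ever produced.

```python
import queue
import threading

shared = queue.Queue()
buffered = []           # requests a consumer pulled but never turned into results
lock = threading.Lock()

def eager_consumer():
    # Mimics a per-model prefetch thread: it grabs a request as soon as
    # one is available, then simply holds it (the model is never run).
    item = shared.get()
    with lock:
        buffered.append(item)

threads = [threading.Thread(target=eager_consumer, daemon=True) for _ in range(4)]
for t in threads:
    t.start()

shared.put("request-0")   # a single request arrives
for t in threads:
    t.join(timeout=0.5)

with lock:
    print(buffered)       # the request was consumed by one idle consumer...
print(shared.qsize())     # ...the shared queue is empty, yet no result came back
```

With 32 such consumers, which consumer happens to win the race decides when (if ever) the request surfaces as a result.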

Here is my test code:

# -*- coding: utf-8 -*-

import tensorflow as tf
import numpy as np
import sys
import random
import time

from queue import Queue
from concurrent.futures import ThreadPoolExecutor

thread_count=int(sys.argv[1])
request_queue=Queue(128)

def data_iter():
    while True:
        yield request_queue.get()

def task():
    with tf.Graph().as_default():
        ds=tf.data.Dataset.from_generator(data_iter, (tf.int32), output_shapes=([1, 8]))
        sample=ds.make_one_shot_iterator().get_next()
        with tf.Session() as sess:
            coord=tf.train.Coordinator()
            threads=tf.train.start_queue_runners(sess=sess, coord=coord)
            while not coord.should_stop():
                try:
                    result=sess.run(sample)
                    print(result)
                except Exception:  # the None sentinel raises inside TF, ending the loop
                    coord.request_stop()
            coord.join(threads)

executor=ThreadPoolExecutor(thread_count)
try:
    for i in range(thread_count):
        executor.submit(task)

    rand=random.Random()
    for i in range(100):
        request_queue.put(np.full((1, 8), i, 'int32'))
        time.sleep(1e-3)  # let a model fetch the request from request_queue
        t=rand.randint(5,10)
        print('round {}, request_queue size is about {}, sleeping {} secs...'.format(i, request_queue.qsize(), t))
        time.sleep(t)
finally:
    for i in range(thread_count):
        request_queue.put(None)
    executor.shutdown()
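
As an aside, the shutdown in the finally block relies on the None put into request_queue making the TF generator wrapper raise. A plain-Python version of the same sentinel-terminated iterator (illustrative only, not part of the test code above) would stop cleanly instead:

```python
import queue

q = queue.Queue()

def data_iter():
    # Block on the queue; a None sentinel ends iteration cleanly
    # (the test code above instead lets None raise inside TF).
    while True:
        item = q.get()
        if item is None:
            return
        yield item

for x in (1, 2, 3, None):
    q.put(x)

result = list(data_iter())
print(result)  # [1, 2, 3]
```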

Environment: Python 3.5.3, TensorFlow 1.4.0

Test results:

  1. Run with a single model: python tf_ds_test.py 1
  2. The output is:

    round 0, request_queue size is about 1, sleeping 6 secs...
    2017-12-21 10:42:24.924251: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
    [[0 0 0 0 0 0 0 0]]
    [[1 1 1 1 1 1 1 1]]
    round 1, request_queue size is about 0, sleeping 6 secs...
    [[2 2 2 2 2 2 2 2]]
    round 2, request_queue size is about 0, sleeping 5 secs...
    [[3 3 3 3 3 3 3 3]]
    round 3, request_queue size is about 0, sleeping 7 secs...
    [[4 4 4 4 4 4 4 4]]
    round 4, request_queue size is about 0, sleeping 6 secs...
    [[5 5 5 5 5 5 5 5]]
    round 5, request_queue size is about 0, sleeping 7 secs...
    ...
    

    Everything runs smoothly.

    1. But when running with 32 models: python tf_ds_test.py 32
    2. The output is:

      2017-12-21 10:45:41.660251: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
      round 0, request_queue size is about 1, sleeping 9 secs...
      [[0 0 0 0 0 0 0 0]]
      [[1 1 1 1 1 1 1 1]]
      round 1, request_queue size is about 0, sleeping 5 secs...
      round 2, request_queue size is about 0, sleeping 8 secs...
      round 3, request_queue size is about 0, sleeping 10 secs...
      [[4 4 4 4 4 4 4 4]]
      [[2 2 2 2 2 2 2 2]]
      [[3 3 3 3 3 3 3 3]]
      round 4, request_queue size is about 0, sleeping 8 secs...
      round 5, request_queue size is about 0, sleeping 6 secs...
      round 6, request_queue size is about 0, sleeping 10 secs...
      [[6 6 6 6 6 6 6 6]]
      [[5 5 5 5 5 5 5 5]]
      round 7, request_queue size is about 0, sleeping 9 secs...
      [[7 7 7 7 7 7 7 7]]
      round 8, request_queue size is about 0, sleeping 5 secs...
      round 9, request_queue size is about 0, sleeping 10 secs...
      round 10, request_queue size is about 0, sleeping 6 secs...
      round 11, request_queue size is about 0, sleeping 10 secs...
      [[8 8 8 8 8 8 8 8]]
      round 12, request_queue size is about 0, sleeping 8 secs...
      

      The requests were blocked! The Python iterator consumed each request immediately, but the model produced no result for an arbitrarily long time, not until it received its next request.

      Does anyone have any ideas? How can I make these models return results immediately?

1 answer:

Answer 0 (score 0):

Could you modify the loop that generates the elements to:

for i in range(100):
    request_queue.put(np.full((1, 8), i, 'int32'))
    print('round {}, queue size {}'.format(i, request_queue.qsize()))

and share the output?

I tried to reproduce your issue (using a nightly build of TF), but even with 1000 tasks and a 10000-iteration loop everything still ran smoothly.

Could you try this with a nightly build of TF?
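
Not a confirmed fix for the TF 1.4 behavior, but one hedged workaround sketch: give each model its own request queue and round-robin a dispatcher over them, so each from_generator iterator can only run ahead on its own queue and can never grab another model's request. Pure Python below; per_model_queues and dispatch are illustrative names, not TF API:

```python
import itertools
import queue

NUM_MODELS = 4
# One queue per model; each model's from_generator iterator would read
# only from its own queue, so prefetching cannot steal another model's input.
per_model_queues = [queue.Queue() for _ in range(NUM_MODELS)]

def dispatch(requests):
    # Round-robin incoming requests across the per-model queues.
    rr = itertools.cycle(per_model_queues)
    for req in requests:
        next(rr).put(req)

dispatch(range(10))
print([q.qsize() for q in per_model_queues])  # [3, 3, 2, 2]
```

Alternatively, feeding each request through a tf.placeholder with feed_dict sidesteps dataset prefetching entirely, at the cost of giving up the tf.data input pipeline.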