Question

I'm trying to use a custom py_func enqueue_op with a TensorFlow RandomShuffleQueue and QueueRunner. I'm very new to TensorFlow and quite confused. This is what I have so far:

import numpy as np
import tensorflow as tf

def compute_data(symbol, time):
    # Placeholder: the real version reads files based on the two keys.
    data = np.zeros((1330,))
    return data

key_1 = [str(x) for x in range(3000)]
key_2 = [str(y) for y in range(4800)]
tf_k1 = tf.constant(key_1)
tf_k2 = tf.constant(key_2)
# Draw one random index into each key list.
tf_k1_index = tf.random_uniform((1,), minval=0, maxval=len(key_1), dtype=tf.int32, name='k1_index')
tf_k2_index = tf.random_uniform((1,), minval=0, maxval=len(key_2), dtype=tf.int32, name='k2_index')
tf_k1_variable = tf.gather_nd(tf_k1, tf_k1_index)
tf_k2_variable = tf.gather_nd(tf_k2, tf_k2_index)
tf_compute_data = tf.py_func(compute_data, [tf_k1_variable, tf_k2_variable], tf.float32, name='py_func_compute_data')

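Basically, what I'm trying to achieve here is this: given two sets of keys, randomly sample a combination of two keys each time and generate a piece of data from them. The data generation involves a lot of file reading, which is skipped for now because I want to get the graph built correctly first.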

Here is the rest of the code, which should enqueue the result of tf_compute_data into the queue.

queue = tf.RandomShuffleQueue(
    capacity=20000, 
    min_after_dequeue=2000, 
    dtypes=[tf.float32], 
    shapes=[[1330]], 
    name='data_queue'
    )

enqueue_op = queue.enqueue(tf_compute_data)
tf_data = queue.dequeue_many(batch_size)

...

qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)
sv = tf.train.Supervisor(logdir="logdir")
with sv.managed_session(config=config, start_standard_services=True) as sess:
    coord = tf.train.Coordinator()
    enqueue_threads = qr.create_threads(sess, coord=coord, start=True)

    for step in xrange(1000000):
        if coord.should_stop():
            break
        sess.run(train_op)
        print step

    coord.request_stop()
    coord.join(enqueue_threads)

When I run the script, the following error is shown:

W tensorflow/core/framework/op_kernel.cc:993] Out of range: RandomShuffleQueue '_0_data_queue' is closed and has insufficient elements (requested 64, current size 0)
     [[Node: data_queue_DequeueMany = QueueDequeueManyV2[component_types=[DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](data_queue, data_queue_DequeueMany/n)]]
W tensorflow/core/framework/op_kernel.cc:993] Out of range: RandomShuffleQueue '_0_data_queue' is closed and has insufficient elements (requested 64, current size 0)
     [[Node: data_queue_DequeueMany = QueueDequeueManyV2[component_types=[DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](data_queue, data_queue_DequeueMany/n)]]

When I add logging to the compute_data function, it shows that it ran only 4 times, once per thread. How do I keep it running for as long as coord.should_stop() is False?

Answer 1

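To summarize the comments, there were two issues:

First, with tf.Graph().as_default() starts from scratch, so everything needs to be redefined inside the new graph.

Second, the dtype returned by a py_func is a bit tricky, because numpy defaults to float64 while most TensorFlow functions default to float32. So when defining a py_func, it may be necessary to explicitly set the dtype of the numpy arrays to float32. There is an error message for this, but I think it was being written to a different stream. So if you have arrived at this page looking for a similar queue error and py_func dtype matching isn't the issue, make sure to check both stdout and stderr for potential errors.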
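The dtype point can be checked with numpy alone. The following is a minimal sketch, assuming the placeholder compute_data from the question; the only change is the explicit dtype argument, and the key strings passed in are arbitrary illustration values. It shows that np.zeros produces float64 unless told otherwise, which would not match the tf.float32 output type declared for the py_func.

```python
import numpy as np

def compute_data(symbol, time):
    # np.zeros defaults to float64; request float32 explicitly so the
    # result matches the tf.float32 output type declared for the py_func.
    data = np.zeros((1330,), dtype=np.float32)
    return data

default_array = np.zeros((1330,))
print(default_array.dtype)                     # float64
print(compute_data("0", "0").dtype)            # float32

# An existing float64 array can also be converted before returning it.
print(default_array.astype(np.float32).dtype)  # float32
```

If the real data comes from file reads rather than np.zeros, calling astype(np.float32) on the final array before returning it should have the same effect.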
