工作环境
问题情景
我正在使用TensorFlow StagingArea操作来提高输入管道的效率。以下是构建输入管道的代码片段的一部分:
train_put_op_list = []
train_get_op_list = []
val_put_op_list = []
val_get_op_list = []
with tf.variable_scope(tf.get_variable_scope()) as vscope:
for i in range(4):
with tf.device('/gpu:%d'%i):
with tf.name_scope('GPU-Tower-%d'%i) as scope:
trainstagingarea = tf.contrib.staging.StagingArea(dtypes=[tf.float32, tf.int32],
shapes=[[64, 221, 221, 3],[64]],
capacity=0)
valstagingarea = tf.contrib.staging.StagingArea(dtypes=[tf.float32, tf.int32],
shapes=[[128, 221, 221, 3],[128]],
capacity=0)
train_put_op_list.append(trainstagingarea.put(train_iterator.get_next()))
val_put_op_list.append(valstagingarea.put(val_iterator.get_next()))
train_get_op_list.append(trainstagingarea.get())
val_get_op_list.append(valstagingarea.get())
with tf.device('/cpu:0'):
worktype = tf.get_variable("wt",[], initializer=tf.zeros_initializer(), trainable=False)
workcondition = tf.equal(worktype, 1)
#elem = tf.cond(workcondition, lambda: train_iterator.get_next(), lambda: val_iterator.get_next())
elem = tf.cond(workcondition, lambda: train_get_op_list[i], lambda: val_get_op_list[i])
# This is followed by the network construction and optimizer
现在在执行时,我首先运行put()
ops几次,然后继续运行迭代。如下所示:
with tf.Session(config=config) as sess:
sess.run(init_op)
sess.run(iterator_training_op)
sess.run(iterator_validation_op)
sess.run(tf.assign(worktype, 0))
for i in range(4):
sess.run(train_put_op_list)
sess.run(val_put_op_list)
writer = tf.summary.FileWriter('.', graph=tf.get_default_graph())
epoch = 0
iter = 0
previous = 0
while(epoch<10):
try:
if(PROCESSINGTYPE is 'validation'):
sess.run(val_put_op_list)
[val_accu, summaries, numsamp] = sess.run([running_accuracy, validation_summary_op, processed])
previous+=numsamp
print("Running Accuracy = {} : Number of sample processed = {} ".format(val_accu, previous))
else:
sess.run(train_put_op_list)
[loss_value, _, train_accu, summaries, batch_accu, numsamp] = sess.run([total_loss, apply_gradient_op, running_accuracy, training_summary_op, batch_accuracy, pr\
ocessed])
#Remaining part of the code (not important for question)
问题说明
StagingArea的使用大大提高了速度(几乎3-4倍)。
但是,由于某些阻止,代码会挂起。我不确定该块是来自get()
还是put()
操作。这是实际输出:
# Validation is done first and the following is the output
Running Accuracy = 0.0 : Number of sample processed = 512
Running Accuracy = 0.00390625 : Number of sample processed = 1024
Running Accuracy = 0.0 : Number of sample processed = 1536
Running Accuracy = 0.001953125 : Number of sample processed = 2048
# The code hangs here
您可以注意到,在tf.Session() as sess:
的开头,get()
和put()
操作的运行时间为4
次。输出也限制为4行。这意味着,
sess.run(val_put_op_list)
循环中的while
不会执行任何操作。因此,当get()
调用sess.run(running_accuracy)...
时,StagingArea
在4
行之后被发现为空,因此会发生阻塞。
get()
和put()
操作的正确方法是什么?StagingArea
已满且put()
被阻止,那还会阻止整个代码吗? TensorFlow文档没有说明任何内容。答案 0 :(得分:1)
看看https://github.com/tensorflow/tensorflow/pull/13684。这解决了一些死锁,可能会进入1.4.0。免责声明:我不是张花。