Update: I strongly suspect the error is related to the init_state that gets created and passed to tf.nn.dynamic_rnn(...) as an argument. The question then becomes: what is the correct shape or way to construct the initial state for a stacked RNN?
I am trying to get a MultiRNNCell definition running in TensorFlow 1.1.
The graph definition, with a helper function for defining the GRU cells, is below. The basic idea is that the placeholder x holds a long run of numeric data samples. A reshape splits that data into frames of equal length, and one frame is presented at each time step. I then want to process it through a stack of two (for now) GRU cells.
def gru_cell(state_size):
    cell = tf.contrib.rnn.GRUCell(state_size)
    return cell

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, [batch_size, num_samples], name="Input_Placeholder")
    y = tf.placeholder(tf.int32, [batch_size, num_frames], name="Labels_Placeholder")

    init_state = tf.zeros([batch_size, state_size], name="Initial_State_Placeholder")

    rnn_inputs = tf.reshape(x, (batch_size, num_frames, frame_length))

    cell = tf.contrib.rnn.MultiRNNCell([gru_cell(state_size) for _ in range(2)], state_is_tuple=False)
    rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state)
The graph definition continues from there with a loss function, an optimizer, and so on, but this is the point where it blows up with the lengthy error below.
For the last part of the error it is relevant that batch_size is 10 and that frame_length and state_size are both 80.
ValueError Traceback (most recent call last)
<ipython-input-30-4c48b596e055> in <module>()
14 print(rnn_inputs)
15 cell = tf.contrib.rnn.MultiRNNCell([gru_cell(state_size) for _ in range(2)], state_is_tuple=False)
---> 16 rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state)
17
18 with tf.variable_scope('softmax'):
/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/rnn.pyc in dynamic_rnn(cell, inputs, sequence_length, initial_state, dtype, parallel_iterations, swap_memory, time_major, scope)
551 swap_memory=swap_memory,
552 sequence_length=sequence_length,
--> 553 dtype=dtype)
554
555 # Outputs of _dynamic_rnn_loop are always shaped [time, batch, depth].
/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/rnn.pyc in _dynamic_rnn_loop(cell, inputs, initial_state, parallel_iterations, swap_memory, sequence_length, dtype)
718 loop_vars=(time, output_ta, state),
719 parallel_iterations=parallel_iterations,
--> 720 swap_memory=swap_memory)
721
722 # Unpack final output if not using output tuples.
/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.pyc in while_loop(cond, body, loop_vars, shape_invariants, parallel_iterations, back_prop, swap_memory, name)
2621 context = WhileContext(parallel_iterations, back_prop, swap_memory, name)
2622 ops.add_to_collection(ops.GraphKeys.WHILE_CONTEXT, context)
-> 2623 result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
2624 return result
2625
/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.pyc in BuildLoop(self, pred, body, loop_vars, shape_invariants)
2454 self.Enter()
2455 original_body_result, exit_vars = self._BuildLoop(
-> 2456 pred, body, original_loop_vars, loop_vars, shape_invariants)
2457 finally:
2458 self.Exit()
/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.pyc in _BuildLoop(self, pred, body, original_loop_vars, loop_vars, shape_invariants)
2435 for m_var, n_var in zip(merge_vars, next_vars):
2436 if isinstance(m_var, ops.Tensor):
-> 2437 _EnforceShapeInvariant(m_var, n_var)
2438
2439 # Exit the loop.
/home/novak/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.pyc in _EnforceShapeInvariant(merge_var, next_var)
565 "Provide shape invariants using either the `shape_invariants` "
566 "argument of tf.while_loop or set_shape() on the loop variables."
--> 567 % (merge_var.name, m_shape, n_shape))
568 else:
569 if not isinstance(var, (ops.IndexedSlices, sparse_tensor.SparseTensor)):
ValueError: The shape for rnn/while/Merge_2:0 is not an invariant for the loop. It enters the loop with shape (10, 80), but has shape (10, 160) after one iteration. Provide shape invariants using either the `shape_invariants` argument of tf.while_loop or set_shape() on the loop variables.
It almost looks as if the network starts out as a 2-stack of size-80 cells and somehow gets converted into a 1-stack of size 160. Any help fixing this? Am I misunderstanding how MultiRNNCell is used?
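For what it's worth, inspecting the stacked cell's state_size property seems to confirm that reading: with state_is_tuple=False it reports the per-layer sizes concatenated, i.e. 160 for two GRU cells of 80, which matches the post-iteration shape in the error.

stacked = tf.contrib.rnn.MultiRNNCell([gru_cell(state_size) for _ in range(2)], state_is_tuple=False)
print(stacked.state_size)  # prints 160: two layers of 80, concatenated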
Answer 0 (score: 1)
Per Allen Lavoie's comment above, the corrected code is:
def gru_cell(state_size):
    cell = tf.contrib.rnn.GRUCell(state_size)
    return cell

num_layers = 2  # <---------

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, [batch_size, num_samples], name="Input_Placeholder")
    y = tf.placeholder(tf.int32, [batch_size, num_frames], name="Labels_Placeholder")

    # With state_is_tuple=False the stacked cell's state is the per-layer states
    # concatenated side by side, so it is num_layers * state_size wide.
    init_state = tf.zeros([batch_size, num_layers * state_size], name="Initial_State_Placeholder")  # <---------

    rnn_inputs = tf.reshape(x, (batch_size, num_frames, frame_length))

    cell = tf.contrib.rnn.MultiRNNCell([gru_cell(state_size) for _ in range(num_layers)], state_is_tuple=False)  # <---------
    rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state)
Note the three marked changes above. Also note that these changes have to ripple through everywhere init_state flows, in particular if you supply it through a feed_dict.
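If you do want to supply the initial state at run time instead of hard-coding zeros, a minimal sketch would be to replace the tf.zeros(...) line with a placeholder of the same concatenated width and feed a matching NumPy array; here x_batch just stands in for whatever your input pipeline yields.

import numpy as np

# Inside the graph definition, in place of the tf.zeros(...) line:
init_state = tf.placeholder(tf.float32, [batch_size, num_layers * state_size],
                            name="Initial_State_Placeholder")

# At run time the fed array must have the same num_layers * state_size width:
with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    zero_state = np.zeros((batch_size, num_layers * state_size), dtype=np.float32)
    outputs, state = sess.run([rnn_outputs, final_state],
                              feed_dict={x: x_batch, init_state: zero_state})

Alternatively, leaving state_is_tuple at its default of True and building the initial state with cell.zero_state(batch_size, tf.float32) avoids tracking the concatenated width by hand, at the cost of final_state becoming a tuple of per-layer tensors.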