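I am implementing a bidirectional token-level GRU network (1 layer forward, 1 layer backward) using TensorFlow version 0.9. When the model is initialized, TensorFlow initializes all variables, creates the GRU cells, and applies all the usual transformations correctly, until it reaches the tf.nn.bidirectional_rnn function, where it throws a ValueError related to merging Tensors with incompatible shapes. Here is the code: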
# Create the cells
with tf.variable_scope('forward'):
    self.char_gru_cell_fw = tf.nn.rnn_cell.GRUCell(char_hidden_size)
with tf.variable_scope('backward'):
    self.char_gru_cell_bw = tf.nn.rnn_cell.GRUCell(char_hidden_size)
# Set initial state of the cells to be zero
self._char_initial_state_fw = \
    self.char_gru_cell_fw.zero_state(batch_size, tf.float32)
self._char_initial_state_bw = \
    self.char_gru_cell_bw.zero_state(batch_size, tf.float32)
# Size before: batch-chrpad-chrvocabsize
# Size after: batch-chrvocabsize
chargruinput = [tf.squeeze(input_, [1]) \
                for input_ in tf.split(1, char_num_steps, chargruinput)]
# Run the bidirectional rnn and get the corner results
_, output_state_fw, output_state_bw = \
    tf.nn.bidirectional_rnn(self.char_gru_cell_fw,
                            self.char_gru_cell_bw,
                            chargruinput,
                            sequence_length=char_num_steps,
                            initial_state_fw=self._char_initial_state_fw,
                            initial_state_bw=self._char_initial_state_bw)
When I run it, I get the following error:
Traceback (most recent call last):
  File "frontbackgru.py", line 409, in <module>
    main()
  File "frontbackgru.py", line 226, in main
    config=my_config)
  File "/home/xG/Code/4-RNN/1-simple-cnn-input-classifier/gru_model.py", line 265, in __init__
    initial_state_bw=self._char_initial_state_bw)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 453, in bidirectional_rnn
    sequence_length, scope=fw_scope)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 156, in rnn
    state_size=cell.state_size)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 343, in _rnn_step
    _maybe_copy_some_through)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1331, in cond
    _, res_f = context_f.BuildCondBranch(fn2)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1230, in BuildCondBranch
    r = fn()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 317, in _maybe_copy_some_through
    lambda: _copy_some_through(new_output, new_state))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1331, in cond
    _, res_f = context_f.BuildCondBranch(fn2)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1230, in BuildCondBranch
    r = fn()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 317, in <lambda>
    lambda: _copy_some_through(new_output, new_state))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 298, in _copy_some_through
    return ([math_ops.select(copy_cond, zero_output, new_output)]
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 1769, in select
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 704, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2262, in create_op
    set_shapes_for_outputs(ret)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1702, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1578, in _SelectShape
    t_e_shape = t_e_shape.merge_with(c_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 570, in merge_with
    (self, other))
ValueError: Shapes (32, 50) and () are not compatible
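Now, the inputs to the bidirectional_rnn function are:
- self.char_gru_cell_fw: a GRUCell instance initialized with the integer char_hidden_size, which is 50 in this case
- self.char_gru_cell_bw: a GRUCell instance initialized with the integer char_hidden_size, which is 50 in this case
- chargruinput: a list of length 30 containing tensors of shape [batch_size, charvocab], which is [32, 256] in this case
- sequence_length: an integer giving the number of unrolled cells, char_num_steps, which is 30 in this case
- initial_state_fw: a zero-filled tensor with the same shape as the GRU state, which is [32, 50] in this case
- initial_state_bw: a zero-filled tensor with the same shape as the GRU state, which is [32, 50] in this case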
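I tried looking through the modules that raise the ValueError, but there is a lot of low-level machinery involved, and it most likely works correctly, given that the CNN I built last week ran without any problems. That makes me think the problem lies above those low-level methods, in the rnn or rnn_cell libraries, which I have not used before.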
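It also seems strange because the error involves an empty shape (which I believe corresponds to a scalar rather than a Tensor), yet the only scalar argument I can change in the bidirectional_rnn call is the sequence_length parameter. I tried omitting it and passing only the initial states, and vice versa, but the same error comes up.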
Has anyone run into a similar problem? My whole system is blocked by this, and I would appreciate any feedback. Thanks in advance.
Answer 0 (score: 0)
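Figured out what was wrong: the sequence_length argument should actually be a list of integers of length batch_size, one per batch entry, rather than a single integer. Easy fix, thanks for playing :)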
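For anyone landing here later, a minimal sketch of that fix in plain Python, so it runs without TensorFlow 0.9 installed. The helper names full_sequence_lengths and copy_through_mask are hypothetical and not part of any TensorFlow API. The second helper is only a rough stand-in for the per-row "has this sequence finished" test that the copy_cond name in the traceback suggests, which is my reading of why a scalar with shape () cannot be matched against a (32, 50) output.

```python
# Values taken from the question: a batch of 32 and 30 unrolled steps.
batch_size = 32
char_num_steps = 30


def full_sequence_lengths(batch_size, num_steps):
    """Return one length per batch entry (hypothetical helper, not a TensorFlow API)."""
    return [num_steps] * batch_size


def copy_through_mask(time, sequence_length):
    """Plain-Python stand-in (an assumption) for a per-row `time >= length` test."""
    return [time >= length for length in sequence_length]


sequence_length = full_sequence_lengths(batch_size, char_num_steps)

# The call from the question would then receive this list instead of the
# bare integer char_num_steps, for example:
#   tf.nn.bidirectional_rnn(..., sequence_length=sequence_length, ...)
# It is left as a comment because it needs TensorFlow 0.9 to run.

mask_early = copy_through_mask(5, sequence_length)   # step 5: no row has finished
mask_done = copy_through_mask(30, sequence_length)   # step 30: every row has finished
print(len(sequence_length), any(mask_early), all(mask_done))
```

If the sequences in a batch are padded to different true lengths, the same list would hold those per-example lengths instead of a repeated char_num_steps.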