I am building a model with a `keras.layers.GRU` layer (with `return_sequences=True`). But in addition to the output at each timestep, I would also like the layer to return two extra outputs: the activations of the update and reset gates at each timestep.
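For context, a short recap of the gates in question. This is the Keras convention for the GRU cell (where `z` and `r` are exactly the tensors concatenated below); note it differs from some textbook formulations in which the roles of $z_t$ and $1 - z_t$ are swapped:

```latex
z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z) \quad \text{(update gate)}
r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r) \quad \text{(reset gate)}
\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)
h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t
```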
To do so, I created:

(1) a modified copy of `keras.layers.GRUCell`, in which `return h, [h]` is replaced by `return K.concatenate([h, z, r]), [h]`, and `self.output_size = self.units` is replaced by `self.output_size = self.units * 3`; and

(2) a modified copy of `keras.layers.GRU`, in which the return statement of `call()`,

```python
return super(GRU, self).call(
    inputs, mask=mask, training=training, initial_state=initial_state)
```

is replaced by

```python
return (rnn_output[:, :, :self.cell.units],
        rnn_output[:, :, self.cell.units:self.cell.units * 2],
        rnn_output[:, :, self.cell.units * 2:self.cell.units * 3])
```

When I try to run this, I get the following error:
```
Traceback (most recent call last):
  File "gru_with_gates.py", line 245, in <module>
    main()
  File "gru_with_gates.py", line 107, in main
    rnn_output = GRUCellWithGates.GRU(cfg.OUTPUT_DIM, return_sequences=True)(full_feature_vector)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/recurrent.py", line 532, in __call__
    return super(RNN, self).__call__(inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File ".../GRUCellWithGates.py", line 467, in call
    initial_state=initial_state)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/recurrent.py", line 649, in call
    input_length=timesteps)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 3011, in rnn
    maximum_iterations=input_length)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 3209, in while_loop
    result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2941, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2878, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 2974, in _step
    output = tf.where(tiled_mask_t, output, states[0])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 2607, in where
    return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 6699, in select
    "Select", condition=condition, t=x, e=y, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1756, in __init__
    control_input_ops)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1592, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimension 1 in both shapes must be equal, but are 768 and 256. Shapes are [?,768] and [?,256]. for 'gru_1/while/Select' (op: 'Select') with input shapes: [?,?], [?,768], [?,256].
```
The problem seems to be this line in keras/backend/tensorflow_backend.py, which assumes that the output and the states must have the same size:

```python
output = tf.where(tiled_mask_t, output, states[0])
```

However, that assumption is not part of the cell specification at https://github.com/keras-team/keras/blob/2.2.4/keras/layers/recurrent.py#L237. In fact, since a cell has to implement two distinct attributes, `output_size` and `state_size`, the two are evidently allowed to differ, so this should not be a problem.
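The shape conflict can be reproduced outside TensorFlow with a toy numpy version of that masking step (a sketch only: `np.where` stands in for `tf.where`, and the numbers mirror the traceback, where `units = 256` so the concatenated output has width 768):

```python
import numpy as np

batch, units = 4, 256
output = np.zeros((batch, units * 3))  # concatenated [h, z, r] from the modified cell
state = np.zeros((batch, units))       # previous hidden state h_{t-1}, width `units`
mask = np.ones((batch, 1), dtype=bool)

# The backend's masking step is, in effect, where(mask, output, states[0]),
# which requires `output` and `states[0]` to have identical shapes:
try:
    np.where(np.broadcast_to(mask, output.shape), output, state)
except ValueError as e:
    print("shape mismatch:", e)

# Selecting only the h part (the first `units` columns) makes the shapes agree:
masked_h = np.where(np.broadcast_to(mask, state.shape), output[:, :units], state)
print(masked_h.shape)  # (4, 256)
```

This illustrates why the backend code works only as long as `output_size == state_size`, even though the cell contract never promises that.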
1) Am I understanding this correctly?
2) If so, is there any way I can work around this limitation?
3) Would there be a better/easier way to get the activations of the update and reset gates out of a GRU layer?
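On (3), one alternative that avoids touching Keras internals altogether is to recompute the gates outside the framework from the layer's learned weights. A minimal numpy sketch of one GRU step, under the assumption that the weights follow the Keras 2.x layout (`kernel`, `recurrent_kernel`, `bias` as returned by `layer.get_weights()`, fused in z/r/h order, with `reset_after=False`, the 2.2.4 default):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, kernel, recurrent_kernel, bias):
    """One GRU step that also returns the update (z) and reset (r) gates.

    Assumed shapes (Keras 2.x fused layout, reset_after=False):
      kernel:           (input_dim, 3 * units)
      recurrent_kernel: (units, 3 * units)
      bias:             (3 * units,)
    """
    units = h_prev.shape[-1]
    # Split the fused kernels into their z, r, h blocks.
    x_z = x_t @ kernel[:, :units] + bias[:units]
    x_r = x_t @ kernel[:, units:2 * units] + bias[units:2 * units]
    x_h = x_t @ kernel[:, 2 * units:] + bias[2 * units:]
    z = sigmoid(x_z + h_prev @ recurrent_kernel[:, :units])            # update gate
    r = sigmoid(x_r + h_prev @ recurrent_kernel[:, units:2 * units])   # reset gate
    hh = np.tanh(x_h + (r * h_prev) @ recurrent_kernel[:, 2 * units:])  # candidate state
    h = z * h_prev + (1.0 - z) * hh                                    # Keras convention
    return h, z, r
```

Looping this over the timesteps with the trained layer's `get_weights()` would yield the per-timestep gate activations without any modified cell, at the cost of duplicating the forward pass in numpy.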