RNNCell with output and state of different sizes

Time: 2018-12-04 13:22:47

Tags: python tensorflow keras

I am building a model with a keras.layers.GRU layer with return_sequences=True. However, in addition to the output at each timestep, I would also like the layer to return two additional outputs: the activations of the "forget" and "reset" gates at each timestep.

To do this, I created

(1) a modified copy of keras.layers.GRUCell, in which return h, [h] is replaced by return K.concatenate([h, z, r]), [h], and

(2) a modified copy of keras.layers.GRU, in which self.output_size = self.units is replaced by self.output_size = self.units * 3, and in which the return statement in call():

return super(GRU, self).call(
    inputs, mask=mask, training=training, initial_state=initial_state)

is replaced with

return (rnn_output[:, :, :self.cell.units],
        rnn_output[:, :, self.cell.units:self.cell.units * 2],
        rnn_output[:, :, self.cell.units * 2:self.cell.units * 3])
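
For concreteness, the change in (2) amounts to something like the following subclass-style sketch. The class name GRUWithGates and the rnn_output assignment are assumptions on my part, since only the new return statement is shown above and the question actually edits a copy of keras/layers/recurrent.py rather than subclassing:

from keras import layers

class GRUWithGates(layers.GRU):
    def call(self, inputs, mask=None, training=None, initial_state=None):
        # run the stock GRU/RNN machinery; with the cell change from (1) the
        # returned sequence carries units * 3 features per timestep
        rnn_output = super(GRUWithGates, self).call(
            inputs, mask=mask, training=training, initial_state=initial_state)
        # split the widened output back into h, the "forget" (update) gate z,
        # and the reset gate r
        return (rnn_output[:, :, :self.cell.units],
                rnn_output[:, :, self.cell.units:self.cell.units * 2],
                rnn_output[:, :, self.cell.units * 2:self.cell.units * 3])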

When I try to run this, I get the following error:

Traceback (most recent call last):
  File "gru_with_gates.py", line 245, in <module>
    main()
  File "gru_with_gates.py", line 107, in main
    rnn_output = GRUCellWithGates.GRU(cfg.OUTPUT_DIM, return_sequences=True)(full_feature_vector)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/recurrent.py", line 532, in __call__
    return super(RNN, self).__call__(inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File ".../GRUCellWithGates.py", line 467, in call
    initial_state=initial_state)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/recurrent.py", line 649, in call
    input_length=timesteps)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 3011, in rnn
    maximum_iterations=input_length)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 3209, in while_loop
    result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2941, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2878, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 2974, in _step
    output = tf.where(tiled_mask_t, output, states[0])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 2607, in where
    return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 6699, in select
    "Select", condition=condition, t=x, e=y, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1756, in __init__
    control_input_ops)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1592, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimension 1 in both shapes must be equal, but are 768 and 256. Shapes are [?,768] and [?,256]. for 'gru_1/while/Select' (op: 'Select') with input shapes: [?,?], [?,768], [?,256].

The problem seems to be this line in keras/backend/tensorflow_backend.py, which assumes that the output and the state should have the same size:

output = tf.where(tiled_mask_t, output, states[0])

However, this assumption is not part of the cell specification given at https://github.com/keras-team/keras/blob/2.2.4/keras/layers/recurrent.py#L237. In fact, since a cell has to implement two distinct attributes, output_size and state_size, having them differ should not be a problem.
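
The size assumption itself can be reproduced outside of Keras with the shapes from the traceback (256 units, so a 768-wide concatenated output). The snippet below is a minimal stand-alone sketch of the failing masked step, using my own placeholder tensors rather than Keras internals:

import tensorflow as tf  # the 1.x graph API that Keras 2.2.4 uses above

units = 256        # 256 per the error message (presumably cfg.OUTPUT_DIM)
batch_size = 8     # arbitrary

cell_output = tf.zeros((batch_size, units * 3))                # concatenated [h, z, r] -> 768 wide
previous_output = tf.zeros((batch_size, units))                # states[0], i.e. h alone -> 256 wide
tiled_mask = tf.zeros((batch_size, units * 3), dtype=tf.bool)  # stand-in for tiled_mask_t

# same pattern as the failing line in keras/backend/tensorflow_backend.py:
#     output = tf.where(tiled_mask_t, output, states[0])
# the underlying Select op requires x and y to have identical shapes, so this raises
# "Dimension 1 in both shapes must be equal, but are 768 and 256"
masked_output = tf.where(tiled_mask, cell_output, previous_output)

Note that this code path in K.rnn is only taken when a mask is passed through the layer, which is evidently the case for the model in the traceback.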

1) Am I understanding this correctly?

2) If so, is there any way I can work around this limitation?

3) Would there be a better/easier way to get the activations of the forget and reset gates from a GRU layer?

0 answers:

No answers yet.