How to use regularization with dynamic_rnn

Date: 2018-04-07 07:31:27

Tags: tensorflow deep-learning regularized

I want to use L2 regularization with dynamic_rnn in TensorFlow, but it does not seem to be handled correctly at the moment; the while loop that dynamic_rnn builds appears to be the source of the error. Below is a snippet that reproduces the problem:

import numpy as np
import tensorflow as tf
tf.reset_default_graph()
batch = 2
dim = 3
hidden = 4

with tf.variable_scope('test', regularizer=tf.contrib.layers.l2_regularizer(0.001)):
    lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
    inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
    cell = tf.nn.rnn_cell.GRUCell(hidden)
    cell_state = cell.zero_state(batch, tf.float32)
    output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
    inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                        [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
                        dtype=np.int32)
    lengths_ = np.asarray([3, 1], dtype=np.int32)
this_throws_error = tf.losses.get_regularization_loss()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_ = sess.run(output, {inputs: inputs_, lengths: lengths_})
    print(output_)

INFO:tensorflow:Cannot use 'test/rnn/gru_cell/gates/kernel/Regularizer/l2_regularizer' as input to 'total_regularization_loss' because 'test/rnn/gru_cell/gates/kernel/Regularizer/l2_regularizer' is in a while loop.

total_regularization_loss while context: None
test/rnn/gru_cell/gates/kernel/Regularizer/l2_regularizer while context: test/rnn/while/while_context

How can I add L2 regularization if my network contains a dynamic_rnn? As a fallback I could collect the trainable variables in the loss computation and add the L2 penalty there myself, but my word vectors are also trainable parameters and I do not want to regularize those.
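The manual fallback mentioned above can be sketched outside of TensorFlow. This is a minimal, hypothetical example (not from the original post): it mirrors what `tf.contrib.layers.l2_regularizer(scale)` computes for each weight, namely `scale * sum(w**2) / 2`, over a name-to-array mapping standing in for `tf.trainable_variables()`, and skips any variable whose name contains an excluded substring such as `embedding`. The variable names and the scale `0.001` are illustrative assumptions.

```python
import numpy as np

def manual_l2_loss(variables, scale=0.001, exclude=("embedding",)):
    """Sum scale * ||w||^2 / 2 over all variables whose names do not
    contain any excluded substring (e.g. word-embedding matrices)."""
    total = 0.0
    for name, value in variables.items():
        if any(token in name for token in exclude):
            continue  # skip the word vectors we don't want to regularize
        total += scale * 0.5 * np.sum(np.square(value))
    return total

# Hypothetical variables standing in for tf.trainable_variables():
variables = {
    "test/rnn/gru_cell/gates/kernel": np.ones((7, 8), dtype=np.float32),
    "test/embedding/word_vectors": np.ones((100, 3), dtype=np.float32),
}
print(manual_l2_loss(variables))  # only the GRU kernel contributes
```

In TF 1.x the same filtering would be applied to `tf.trainable_variables()`, summing `scale * tf.nn.l2_loss(v)` over the variables that pass the name filter and adding the result to the training loss.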

1 Answer:

Answer 0 (score: 0)

I ran into the same problem, and I first tried to solve it with tensorflow==1.9.0.

Code:

import numpy as np
import tensorflow as tf
tf.reset_default_graph()
batch = 2
dim = 3
hidden = 4

with tf.variable_scope('test', regularizer=tf.contrib.layers.l2_regularizer(0.001)):
    lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
    inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
    cell = tf.nn.rnn_cell.GRUCell(hidden)
    cell_state = cell.zero_state(batch, tf.float32)
    output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                        [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
                        dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
this_throws_error = tf.losses.get_regularization_loss()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_ = sess.run(output, {inputs: inputs_, lengths: lengths_})
    print(output_)
    print(sess.run(this_throws_error))

This is the result of running the code:

...
File "/Users/piero/Development/mlenv3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_util.py", line 314, in CheckInputFromValidContext
    raise ValueError(error_msg + " See info log for more details.")
ValueError: Cannot use 'test/rnn/gru_cell/gates/kernel/Regularizer/l2_regularizer' as input to 'total_regularization_loss' because 'test/rnn/gru_cell/gates/kernel/Regularizer/l2_regularizer' is in a while loop. See info log for more details.

Then I tried moving the dynamic_rnn call outside the variable scope:

import numpy as np
import tensorflow as tf
tf.reset_default_graph()
batch = 2
dim = 3
hidden = 4

with tf.variable_scope('test', regularizer=tf.contrib.layers.l2_regularizer(0.001)):
    lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
    inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
    cell = tf.nn.rnn_cell.GRUCell(hidden)
    cell_state = cell.zero_state(batch, tf.float32)
output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                        [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
                        dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
this_throws_error = tf.losses.get_regularization_loss()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_ = sess.run(output, {inputs: inputs_, lengths: lengths_})
    print(output_)
    print(sess.run(this_throws_error))

In theory this should be fine, since the regularization applies to the weights of the RNN, which should include the variables initialized when the RNN cell is created inside the scope.

This is the output:

[[[ 0.          0.          0.          0.        ]
  [ 0.1526176   0.33048663 -0.02288104 -0.1016309 ]
  [ 0.24402776  0.68280864 -0.04888818 -0.26671126]
  [ 0.          0.          0.          0.        ]]

 [[ 0.01998052  0.82368904 -0.00891946 -0.38874635]
  [ 0.          0.          0.          0.        ]
  [ 0.          0.          0.          0.        ]
  [ 0.          0.          0.          0.        ]]]
0.0

So moving the dynamic_rnn call outside the variable scope runs in the sense that it no longer raises an error, but the loss value is 0, which suggests that none of the RNN's weights are actually being included in the L2 loss.

Then I tried tensorflow==1.12.0. Here is the output of the first script, with dynamic_rnn inside the scope:

[[[ 0.          0.          0.          0.        ]
  [-0.17653276  0.06490126  0.02065791 -0.05175343]
  [-0.413078    0.14486027  0.03922977 -0.1465032 ]
  [ 0.          0.          0.          0.        ]]

 [[-0.5176822   0.03947531  0.00206934 -0.5542746 ]
  [ 0.          0.          0.          0.        ]
  [ 0.          0.          0.          0.        ]
  [ 0.          0.          0.          0.        ]]]
0.010403235

And here is the output with dynamic_rnn outside the scope:

[[[ 0.          0.          0.          0.        ]
  [ 0.04208181  0.03031874 -0.1749279   0.04617848]
  [ 0.12169671  0.09322995 -0.29029205  0.08247502]
  [ 0.          0.          0.          0.        ]]

 [[ 0.09673716  0.13300316 -0.02427006  0.00156245]
  [ 0.          0.          0.          0.        ]
  [ 0.          0.          0.          0.        ]
  [ 0.          0.          0.          0.        ]]]
0.0

The fact that the version with dynamic_rnn inside the scope now returns a non-zero value suggests that it works as intended, while the 0 returned in the other case shows that that variant still does not behave as expected. So the bottom line is: this was a bug in TensorFlow, and it was fixed somewhere between version 1.9.0 and version 1.12.0.
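Since the fix is tied to the TensorFlow version, one practical check before trusting `tf.losses.get_regularization_loss()` is to compare the installed version (available in TF 1.x as `tf.VERSION`) against 1.12.0. Note that a naive string comparison is misleading here, because "1.12.0" sorts below "1.9.0" lexicographically; a small helper (my own sketch, not part of the original answer) compares numerically:

```python
def version_tuple(version):
    """Convert a dotted version string like "1.12.0" into a tuple of
    ints so that comparisons are numeric, not lexicographic."""
    return tuple(int(part) for part in version.split("."))

# Lexicographically "1.12.0" < "1.9.0", but numerically it is newer:
assert "1.12.0" < "1.9.0"                                # misleading
assert version_tuple("1.12.0") > version_tuple("1.9.0")  # correct

def rnn_regularization_fixed(tf_version):
    # 1.12.0 is the earliest version this answer observed to work;
    # the actual fix landed somewhere between 1.9.0 and 1.12.0.
    return version_tuple(tf_version) >= (1, 12, 0)

print(rnn_regularization_fixed("1.9.0"))   # False
print(rnn_regularization_fixed("1.12.0"))  # True
```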