How to apply regularization to an LSTM in TensorFlow?

Asked: 2017-05-11 21:36:02

Tags: python tensorflow

I want to apply L2 regularization to my LSTM model. My problem is that I cannot get at the graph's weights to compute the L2 norm.

I am using an Estimator instance and pass a model function as an argument to build the graph. In my model function I create an LSTM cell, a tf.contrib.rnn.LSTMCell wrapped in static_rnn. Now I want to add L2 regularization to the loss, but I don't know how to access the LSTM weights.

The problem is that I cannot print anything. I don't know the names of the graph's variables, and I cannot print them either, because the model won't build: I don't know which variable names to feed to tf.nn.l2_loss().

How do I access the LSTM weights? By calling tf.trainable_variables() or tf.get_variable()?
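
Conceptually, what I am after is something like the sketch below. This is hypothetical code, and exactly the part I cannot write: I don't know whether these are the right calls, or what the weight variables are actually called.

# Hypothetical sketch of what I want inside my model function.
# 'weights' is my guess at a name filter for the LSTM kernel; the
# 0.01 factor is just a placeholder regularization strength.
l2 = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()
               if 'weights' in v.name])
loss = loss + 0.01 * l2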

More generally, how can I print information about a graph I am building, before it is actually created?

Please be specific...

----------- EDIT:

As requested, here is a minimal model. Since I am new to TensorFlow, it is probably neither minimal nor efficient.

import tensorflow as tf
from tensorflow.contrib import learn
from tensorflow.contrib.learn.python.learn.estimators import model_fn as model_fn_lib


def get_inp():
    x_train = tf.constant([[[1], [2], [3]],
                           [[2], [3], [4]],
                           [[3], [4], [5]],
                           [[4], [5], [6]]],
                          dtype=tf.float32)
    y_train = tf.constant([[4],
                           [5],
                           [6],
                           [7]],
                          dtype=tf.float32)
    return x_train, y_train


def lstm_model(features, target, mode, params):
    x_ = tf.unstack(features, axis=0)

    # LSTM Layer and wrappers
    single_lstm = tf.contrib.rnn.LSTMCell(params['hidden_size'],
                                          initializer=tf.random_uniform_initializer(-2, 2))
    wrapper = tf.contrib.rnn.MultiRNNCell([single_lstm])
    output, layers = tf.contrib.rnn.static_rnn(wrapper, x_, dtype=tf.float32)

    # Linear output layer to achieve the right output size
    output = tf.reshape(output, [features.get_shape().as_list()[0], -1], name='reshape')
    W = tf.get_variable("W",
                        [3 * params['hidden_size'], 1],
                        initializer=tf.random_uniform_initializer(-2, 2),
                        dtype=tf.float32)

    b = tf.get_variable("b",
                        [1],
                        initializer=tf.constant_initializer(0.0),
                        dtype=tf.float32)
    predictions = tf.unstack(tf.transpose(tf.matmul(output, W) + b), axis=1, name='unstack')

    predictions_dict = {"predictions": predictions}

    # Loss function
    loss = tf.losses.mean_squared_error(target, predictions)

    train_op = tf.contrib.layers.optimize_loss(loss=loss,
                                               global_step=tf.contrib.framework.get_global_step(),
                                               learning_rate=params["learning_rate"],
                                               optimizer=params["optimizer"],
                                               name='optimize_loss')

    return model_fn_lib.ModelFnOps(mode=mode,
                                   predictions=predictions_dict,
                                   loss=loss,
                                   train_op=train_op)


model_params = {'learning_rate': 0.01,
                'optimizer': 'Adam',
                'hidden_size': 2}

# Creates the estimator instance
regressor = learn.Estimator(model_fn=lstm_model,
                            params=model_params,
                            config=tf.contrib.learn.RunConfig(save_checkpoints_secs=15))

regressor.fit(input_fn=lambda: get_inp(),
              steps=1)

print(regressor.get_variable_names())

# As suggested by Engineero
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    variable_names = [v.name for v in tf.trainable_variables()]
print(variable_names)

This outputs:

['W', 'b', 'global_step', 'optimize_loss/W/Adam', 'optimize_loss/W/Adam_1', 'optimize_loss/b/Adam', 'optimize_loss/b/Adam_1', 'optimize_loss/beta1_power', 'optimize_loss/beta2_power', 'optimize_loss/learning_rate', 'optimize_loss/rnn/multi_rnn_cell/cell_0/lstm_cell/biases/Adam', 'optimize_loss/rnn/multi_rnn_cell/cell_0/lstm_cell/biases/Adam_1', 'optimize_loss/rnn/multi_rnn_cell/cell_0/lstm_cell/weights/Adam', 'optimize_loss/rnn/multi_rnn_cell/cell_0/lstm_cell/weights/Adam_1', 'rnn/multi_rnn_cell/cell_0/lstm_cell/biases', 'rnn/multi_rnn_cell/cell_0/lstm_cell/weights']
[]

1 answer:

Answer 0: (score: 1)

To get a list of the names of all your trainable variables:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize things first
    variable_names = [v.name for v in tf.trainable_variables()]

If you want to see the values of the variables, you can do something like the following (still inside the with tf.Session() block):

variable_values = [sess.run(v) for v in tf.trainable_variables()]

I like to look at the names and shapes of my variables before starting the training loop, to make sure my weights and biases are the sizes I expect:

variable_shapes = [v.get_shape() for v in tf.trainable_variables()]
for name, shape in zip(variable_names, variable_shapes):
    print('{}\nShape: {}'.format(name, shape))
# ... training loop starts later ...

Once you know your variables' names, you can grab one with something like:

var = [v for v in tf.trainable_variables() if v.name == expected_name][0]

Then I think you can apply your regularization directly from there.
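
Note that tf.trainable_variables() only sees the current default graph, so with an Estimator it has to run inside your model function; that is also why the standalone with tf.Session() block in your edit prints an empty list. As a sketch, assuming the variable names from your printed list and an arbitrary example strength of 0.001:

# Sketch, inside lstm_model after `loss` is defined. Note that
# v.name carries a ':0' output suffix, e.g. 'W:0'. The 0.001
# factor is an arbitrary example value, not a recommendation.
reg_vars = [v for v in tf.trainable_variables()
            if 'lstm_cell/weights' in v.name or v.name == 'W:0']
l2_penalty = tf.add_n([tf.nn.l2_loss(v) for v in reg_vars])
loss = loss + 0.001 * l2_penalty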