Dynamic vs. static parameter size in tf.nn.dynamic_rnn

Date: 2018-10-03 07:27:59

Tags: tensorflow rnn cudnn

I am trying to understand the encoder-decoder mechanism used in neural machine translation, and I am experimenting with tf.contrib.cudnn_rnn.CudnnGRU. Here is the work I am referring to. In that code, the author computes the static parameter size of the CUDA RNN as follows:

def cuda_params_size(cuda_model_builder):
    """
    Calculates static parameter size for CUDA RNN
    :param cuda_model_builder:
    :return:
    """
    with tf.Graph().as_default():
        cuda_model = cuda_model_builder()
        params_size_t = cuda_model.params_size()
        config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
        with tf.Session(config=config) as sess:
            result = sess.run(params_size_t)
            return result
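
For reference, here is a minimal sketch of how I understand this helper being called. The layer sizes are hypothetical, and it assumes a TF 1.x build where CudnnGRU still accepts input_size and exposes params_size(), as in the referenced code:

import tensorflow as tf
from tensorflow.contrib.cudnn_rnn import CudnnGRU as RNN  # old contrib API with params_size()

def build_rnn():
    # Hypothetical sizes, only for illustration: 2 layers, 128 units, 64 input features
    return RNN(num_layers=2, num_units=128, input_size=64,
               direction='unidirectional', dropout=0.0)

static_p_size = cuda_params_size(build_rnn)
print(static_p_size)  # a concrete integer: the length of the packed CuDNN weight buffer

Because the helper builds the model inside its own tf.Graph() and tf.Session(), the result it returns is an ordinary number rather than a tensor in the caller's graph.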

The encoder is built as follows:

def make_encoder(time_inputs, encoder_features_depth, is_train, hparams, seed, transpose_output=True):
    """
    Builds encoder, using CUDA RNN
    :param time_inputs: Input tensor, shape [batch, time, features]
    :param encoder_features_depth: Static size for features dimension
    :param is_train:
    :param hparams:
    :param seed:
    :param transpose_output: Transform RNN output to batch-first shape
    :return:
    """

    def build_rnn():
        return RNN(num_layers=hparams.encoder_rnn_layers, num_units=hparams.rnn_depth,
                   input_size=encoder_features_depth,
                   direction='unidirectional',
                   dropout=hparams.encoder_dropout if is_train else 0, seed=seed)

    static_p_size = cuda_params_size(build_rnn)
    cuda_model = build_rnn()
    params_size_t = cuda_model.params_size()
    with tf.control_dependencies([tf.assert_equal(params_size_t, [static_p_size])]):
        cuda_params = tf.get_variable("cuda_rnn_params",
                                      initializer=tf.random_uniform([static_p_size], minval=-0.05, maxval=0.05,
                                                                    dtype=tf.float32, seed=seed + 1 if seed else None))
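
To be concrete about the two quantities this code produces (my reading, names taken from the code above):

# static_p_size: a concrete Python/NumPy integer, obtained via sess.run() in a
# throwaway graph inside cuda_params_size(), so it can be used directly as the
# shape of the "cuda_rnn_params" variable.
print(type(static_p_size))   # e.g. <class 'numpy.int32'>

# params_size_t: a tf.Tensor in the current graph, whose value is only known
# when the session runs, so it can only be checked with tf.assert_equal.
print(type(params_size_t))   # <class 'tensorflow.python.framework.ops.Tensor'>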

What I don't understand is why he computes the static parameter size and the dynamic parameter size separately. How could the parameter size change at all? (Both are defined by building the same RNN.)

0 Answers:

No answers yet