I'm trying to build a network architecture made up of 4 bidirectional LSTMs, and I want to sum the last states of all of them and pass the result through a linear projection layer.
So my network structure is:
Lstm
Lstm
Lstm
Lstm
Sum(All_four_lstm_last_state)
Linear_layer ( Sum )
Softmax( Linear_layer )
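The fusion step above (sum the four last states, then linear layer, then softmax) can be sketched in plain numpy; the sizes and weight names here are illustrative assumptions, not my real hyper-parameters:

```python
import numpy as np

batch_size, state_size, num_classes = 2, 6, 3  # illustrative sizes
rng = np.random.default_rng(0)

# stand-ins for the four BiLSTM last-state vectors, each [batch_size, state_size]
states = [rng.standard_normal((batch_size, state_size)) for _ in range(4)]

summed = np.sum(states, axis=0)                      # Sum(All_four_lstm_last_state)
W = rng.standard_normal((state_size, num_classes))   # linear projection weights
b = np.zeros(num_classes)
logits = summed @ W + b                              # Linear_layer(Sum)

# numerically stable softmax over the class axis
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)        # Softmax(Linear_layer)
```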
My confusion is that I created a function for the LSTM cell that I want to reuse for all four RNNs, but I'm unsure how variable_scope behaves here.
So my code is as follows.
First, I created the LSTM cell functions:
with tf.variable_scope('encoder'):
    # forward cell of the bi-directional LSTM network
    def fr_cell():
        fr_cell_lstm = rnn.LSTMCell(num_units=rnn_num_units, forget_bias=forget_bias_)
        return rnn.DropoutWrapper(cell=fr_cell_lstm, output_keep_prob=1. - dropout, dtype=tf.float32)
    fr_cell_m = rnn.MultiRNNCell([fr_cell() for _ in range(1)], state_is_tuple=True)

with tf.variable_scope('encoder'):
    # backward cell of the bi-directional LSTM network
    def bw_cell():
        bw_cell_lstm = rnn.LSTMCell(num_units=rnn_num_units, forget_bias=forget_bias_)
        return rnn.DropoutWrapper(cell=bw_cell_lstm, output_keep_prob=1. - dropout, dtype=tf.float32)
    bw_cell_m = rnn.MultiRNNCell([bw_cell() for _ in range(1)], state_is_tuple=True)
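For reference, the rule that decides whether a second `with tf.variable_scope('encoder'):` block shares or re-creates variables can be seen in isolation with `get_variable`. This is a minimal sketch (using `tf.compat.v1` so it also runs under TF 2.x; the variable name `w` is illustrative) showing that `AUTO_REUSE` makes a second entry into the same scope return the existing variable instead of raising a "variable already exists" error:

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # TF1-style graph mode

graph = tf.Graph()
with graph.as_default():
    # first entry into the scope creates encoder/w
    with tf1.variable_scope('encoder', reuse=tf1.AUTO_REUSE):
        a = tf1.get_variable('w', shape=[4])
    # second entry with AUTO_REUSE returns the same variable
    with tf1.variable_scope('encoder', reuse=tf1.AUTO_REUSE):
        b = tf1.get_variable('w', shape=[4])

shared = a is b  # both names resolve to encoder/w
```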
Now I have the four RNNs:
# Bi-directional lstm network
with tf.variable_scope('encoder') as scope:
    model, (state_c, state_h) = tf.nn.bidirectional_dynamic_rnn(
        cell_fw=fr_cell_m,                  # forward cell
        cell_bw=bw_cell_m,                  # backward cell
        inputs=long_word_embedding_lookup,  # 3-dim embedding input for the rnn
        sequence_length=l_w_sequence_len,   # per-example lengths, shape [batch_size]
        dtype=tf.float32
    )
    transpose = tf.transpose(tf.concat(model, 2), [1, 0, 2])
    long_word_state_output = tf.concat([state_c[0].c, state_h[0].c], axis=-1)
# Bi-directional lstm network
with tf.variable_scope('encoder') as scope:
    model, (state_c, state_h) = tf.nn.bidirectional_dynamic_rnn(
        cell_fw=fr_cell_m,                   # forward cell
        cell_bw=bw_cell_m,                   # backward cell
        inputs=short_word_embedding_lookup,  # 3-dim embedding input for the rnn
        sequence_length=s_w_sequence_len,    # per-example lengths, shape [batch_size]
        dtype=tf.float32
    )
    transpose = tf.transpose(tf.concat(model, 2), [1, 0, 2])
    short_word_state_output = tf.concat([state_c[0].c, state_h[0].c], axis=-1)
with tf.variable_scope('encoder') as scope:
    model, (state_c, state_h) = tf.nn.bidirectional_dynamic_rnn(
        cell_fw=fr_cell_m,                  # forward cell
        cell_bw=bw_cell_m,                  # backward cell
        inputs=long_char_embedding_lookup,  # 3-dim embedding input for the rnn
        sequence_length=l_c_sequence_len,   # per-example lengths, shape [batch_size]
        dtype=tf.float32
    )
    transpose = tf.transpose(tf.concat(model, 2), [1, 0, 2])
    long_char_state_output = tf.concat([state_c[0].c, state_h[0].c], axis=-1)
with tf.variable_scope('encoder') as scope:
    model, (state_c, state_h) = tf.nn.bidirectional_dynamic_rnn(
        cell_fw=fr_cell_m,                   # forward cell
        cell_bw=bw_cell_m,                   # backward cell
        inputs=short_char_embedding_lookup,  # 3-dim embedding input for the rnn
        sequence_length=s_c_sequence_len,    # per-example lengths, shape [batch_size]
        dtype=tf.float32
    )
    transpose = tf.transpose(tf.concat(model, 2), [1, 0, 2])
    short_char_state_output = tf.concat([state_c[0].c, state_h[0].c], axis=-1)
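A compact sketch of the overall pattern with the encoder weights explicitly shared across all four calls, assuming the intent is one shared encoder. It uses `tf.compat.v1` with `AUTO_REUSE`, plain `LSTMCell`s instead of my wrapped MultiRNNCells, and illustrative sizes rather than the original hyper-parameters:

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

num_units, emb_dim = 8, 4  # illustrative sizes
graph = tf.Graph()
with graph.as_default():
    cell_fw = tf1.nn.rnn_cell.LSTMCell(num_units)
    cell_bw = tf1.nn.rnn_cell.LSTMCell(num_units)

    state_outputs = []
    for seq_len in (5, 7, 5, 7):  # four stand-in inputs of different lengths
        inputs = tf1.placeholder(tf.float32, [None, seq_len, emb_dim])
        # AUTO_REUSE: the first call creates the encoder weights,
        # the remaining three calls reuse them
        with tf1.variable_scope('encoder', reuse=tf1.AUTO_REUSE):
            _, (st_fw, st_bw) = tf1.nn.bidirectional_dynamic_rnn(
                cell_fw, cell_bw, inputs, dtype=tf.float32)
        state_outputs.append(tf.concat([st_fw.h, st_bw.h], axis=-1))

    summed = tf.add_n(state_outputs)  # Sum(All_four_lstm_last_state)
    # one kernel + one bias per direction => 4 trainable variables in total
    n_vars = len(graph.get_collection(tf1.GraphKeys.TRAINABLE_VARIABLES))
```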
As you can see, I used the same variable_scope ('encoder') for the LSTM cells and for all four RNNs.
I would appreciate any advice on whether this is the correct approach, or, given that my four inputs are different and I want to sum their states, what I should change.
Thank you.