How to use the tf.nn.raw_rnn function in TensorFlow?

Time: 2018-03-02 09:54:31

Tags: python tensorflow machine-learning

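I am trying to implement an LSTM-based network in which, after the hidden state computation, we also apply a linear + sigmoid transformation at each time step. I found the official documentation and a nice article that describe the tf.nn.raw_rnn function, which seems suitable for this task, but I am struggling to understand why it does not work in my particular case.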

Input description

So, let the input to the LSTM be a minibatch of size [num_steps x batch_size x size], concretely [5, 32, 100]. Let the LSTM have 200 hidden units. Then the output of the LSTM is a [5, 32, 200] tensor, which we can later use for loss computation. I assume the input [5, 32, 100] tensor is first unstacked into an array of [32, 100] tensors and then stacked back when we use tf.nn.dynamic_rnn with time_major=True in TensorFlow:

    tf.nn.dynamic_rnn(LSTM)
    LSTM t=0
    LSTM t=1
    LSTM t=2
    LSTM t=3
    LSTM t=4
    [5, 32, 100] --> [[32, 100], [32, 100], [32, 100], [32, 100], [32, 100]] --> [5, 32, 200]

Hidden state model

In addition, after each LSTM cell I need to perform a linear + sigmoid transformation, for example to squash each [32, 200] tensor to [32, 1]. tf.nn.dynamic_rnn will not work for this, since it only accepts cells. We need to use the tf.nn.raw_rnn API. So, here is my attempt:

    def _get_raw_rnn_graph(self, inputs):
        time = tf.constant(0, dtype=tf.int32)
        _inputs_ta = tf.TensorArray(dtype=tf.float32, size=5)
        # our [5, 32, 100] tensor becomes [[32, 100], [32, 100], ...]
        _inputs_ta = _inputs_ta.unstack(inputs)

        # create simple LSTM cell
        cell = tf.contrib.rnn.LSTMCell(config.hidden_size)

        # create loop_fn for raw_rnn
        def loop_fn(time, cell_output, cell_state, loop_state):
            emit_output = cell_output  # == None if time = 0

            if cell_output is None:  # time = 0
                next_cell_state = cell.zero_state(32, tf.float32)
                self._initial_state = next_cell_state
            else:
                next_cell_state = cell_state

            elements_finished = (time >= 32)
            finished = tf.reduce_all(elements_finished)
            next_input = tf.cond(finished,
                                 lambda: tf.zeros([32, config.input_size], dtype=tf.float32),
                                 lambda: _inputs_ta.read(time))

            # apply linear + sig transform here
            next_input = self._linear_transform(next_input, activation=tf.sigmoid)

            next_loop_state = None
            return (elements_finished, next_input, next_cell_state, emit_output, next_loop_state)

        outputs_ta, final_state, _ = tf.nn.raw_rnn(cell, loop_fn)
        outputs = outputs_ta.stack()
        return outputs, final_state

Unfortunately, this does not work. loop_fn iterates only two times instead of num_steps times as I expected, and its output is Tensor("Train/Model/TensorArrayStack/TensorArrayGatherV3:0", shape=(?, 32, 200), dtype=float32) rather than the [5, 32, 1] we expected. What am I missing here?
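
To make the expected shapes concrete, here is a minimal NumPy sketch of the computation I have in mind. It is not TensorFlow code and it is not a fix for the attempt in this question. As an assumption for illustration, a plain tanh recurrence stands in for the LSTM cell, and the names W_x, W_h, W_out, b_out and sigmoid are hypothetical. It only shows a time-major [5, 32, 100] input being unstacked into per-step [32, 100] slices, each step producing a [32, 200] hidden output, and a linear + sigmoid transform applied to that output to get [32, 1] per step.

```python
import numpy as np

num_steps, batch_size, input_size, hidden_size = 5, 32, 100, 200
rng = np.random.default_rng(0)

# Hypothetical parameters. A plain tanh recurrence stands in for the LSTM cell.
W_x = rng.normal(scale=0.1, size=(input_size, hidden_size))
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
# Hypothetical weights of the linear + sigmoid layer applied to each step's output.
W_out = rng.normal(scale=0.1, size=(hidden_size, 1))
b_out = np.zeros(1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Time-major input: [num_steps, batch_size, input_size] == [5, 32, 100]
inputs = rng.normal(size=(num_steps, batch_size, input_size))

state = np.zeros((batch_size, hidden_size))
hidden_steps, squashed_steps = [], []
for x_t in inputs:  # iterating over axis 0 yields one [32, 100] slice per step
    state = np.tanh(x_t @ W_x + state @ W_h)               # [32, 200]
    hidden_steps.append(state)
    squashed_steps.append(sigmoid(state @ W_out + b_out))  # [32, 1]

hidden = np.stack(hidden_steps)      # stacked back to [5, 32, 200]
squashed = np.stack(squashed_steps)  # stacked back to [5, 32, 1]
print(hidden.shape, squashed.shape)  # (5, 32, 200) (5, 32, 1)
```

Note that in this sketch the linear + sigmoid transform is applied to the output of each step, not to the next input, and the loop runs exactly num_steps times.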

0 answers:

No answers yet.