
Question

I want to implement the 2D LSTM from this paper, and in particular I want to do it dynamically, so I am using tf.while_loop. In brief, the network works as follows.

1. Order the pixels in the image so that pixel i, j -> i * width + j
2. Run a 2D-LSTM over this sequence
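To make the ordering concrete, here is a minimal plain-Python sketch of the row-major mapping described above. The helper names `flat_index` and `pixel_coords` and the toy image size are my own, chosen only for illustration.

```python
def flat_index(i, j, width):
    """Map pixel (row i, column j) to its position in the row-major sequence."""
    return i * width + j


def pixel_coords(k, width):
    """Inverse mapping: sequence position k back to (row, column)."""
    return divmod(k, width)


# Hypothetical toy image, small enough to inspect by hand.
height, width = 3, 4
order = [flat_index(i, j, width) for i in range(height) for j in range(width)]
```

Scanning rows top to bottom and columns left to right yields consecutive sequence positions, so the 2D image becomes an ordinary 1D sequence.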

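The difference between a 2D and a regular LSTM is that we have recurrent connections both to the previous element in the sequence and to the pixel directly above the current pixel, so at pixel i, j there are connections to i - 1, j and i, j - 1.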
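As a small illustration of those two recurrent connections, the sketch below computes the flat sequence positions of the left and top neighbours of a pixel. The function name `neighbour_indices` is hypothetical, and returning `None` at the border is just one way to signal "use the zero state here".

```python
def neighbour_indices(i, j, width):
    """Return (left, top) flat indices for pixel (i, j).

    None means the neighbour lies outside the image, which is where
    a zero state would be substituted.
    """
    left = i * width + (j - 1) if j > 0 else None
    top = (i - 1) * width + j if i > 0 else None
    return left, top
```

For example, with width 28 the pixel (2, 3) depends on sequence positions 58 (left) and 31 (top), while pixel (0, 0) has no neighbours at all.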

What I have done

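I tried to do this with tf.while_loop: in each iteration of the loop I accumulate the activations and cell states into a tensor whose shape I allow to vary. This is what the following block of code attempts to do.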

def single_lstm_layer(inputs, height, width, units, direction = 'tl'):
    with tf.variable_scope(direction) as scope:
        #Get 2D lstm cell
        cell = lstm_cell

        #position in sequence
        row, col = tf.to_int32(0), tf.to_int32(0)

        #use for when i - 1 < 0 or j - 1 < 0
        zero_state = tf.fill([1, units], 0.0)

        #get first activation and cell_state
        output, state = cell(inputs.read(row * width + col), zero_state, zero_state, zero_state, zero_state)

        #these are currently of shape (1, units) will ultimately be of shape
        #(height * width, units)
        activations = output
        cell_states = state
        col += 1

    with tf.variable_scope(direction, reuse = True) as scope:

        def loop_fn(activations, cell_states, row, col):
            #Read next input in sequence
            i = inputs.read(row * width + col)

            #if we are not in the first row then we want to get the activation/cell_state
            #above us. Otherwise use zero state.
            hidden_state_t = tf.cond(tf.greater_equal(row - 1, 0), 
                                    lambda:tf.gather(activations, [(row - 1) * (width) + col]),
                                    lambda:tf.identity(zero_state))
            cell_state_t = tf.cond(tf.greater_equal(row - 1, 0), 
                                    lambda:tf.gather(cell_states, [(row - 1) * (width) + col]),
                                    lambda:tf.identity(zero_state))

            #if we are not in the first col then we want to get the activation/cell_state
            #left of us. Otherwise use zero state.
            hidden_state_l = tf.cond(tf.greater_equal(col - 1, 0), 
                                    lambda:tf.gather(activations, [row * (width) + col - 1]),
                                    lambda:tf.identity(zero_state))
            cell_state_l = tf.cond(tf.greater_equal(col - 1, 0), 
                                    lambda:tf.gather(cell_states, [row * (width) + col - 1]),
                                    lambda:tf.identity(zero_state))

            #Using previous activations/cell_states get current activation/cell_state
            output, state = cell(i, hidden_state_l, hidden_state_t, cell_state_l, cell_state_t)

            #Append to bottom, will increase number of rows by 1
            activations = tf.concat(0, [activations, output])
            cell_states = tf.concat(0, [cell_states, state])

            #move to next item in sequence
            col = tf.cond(tf.equal(col, width - 1), lambda:tf.mul(col, 0), lambda:tf.add(col, 1))
            row = tf.cond(tf.equal(col, 0), lambda:tf.add(row, 1), lambda:tf.identity(row))
            return activations, cell_states, row, col,
        row, col = tf.to_int32(0), tf.constant(1)
        activations, cell_states, _, _ = tf.while_loop(
                                              cond = lambda activations, cell_states, row, col: tf.logical_and(tf.less_equal(row , (height - 1)), tf.less_equal(col, width -1)) ,
                                              body = loop_fn,
                                              loop_vars = (activations,   
                                                        cell_states, 
                                                        row, 
                                                        col),
                                              shape_invariants = (tf.TensorShape((None, units)), 
                                                                tf.TensorShape((None, units)),
                                                                tf.TensorShape([]),
                                                                tf.TensorShape([]),
                                                                ),
                                                        )
        #Return activations with shape [height, width, units]
        return tf.pack(tf.split(0, height, activations))
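
The row/col bookkeeping in loop_fn can be checked without TensorFlow. The following plain-Python sketch replays the same counter updates and loop condition; the function name `visited_positions` is my own, and it only models the counters, not the LSTM cell itself.

```python
def visited_positions(height, width):
    """Replay the (row, col) updates of the loop above in plain Python."""
    row, col = 0, 1           # pixel (0, 0) is processed before the loop starts
    visited = [(0, 0)]
    while row <= height - 1 and col <= width - 1:
        visited.append((row, col))
        # Same order as loop_fn: update col first, then row from the new col.
        col = 0 if col == width - 1 else col + 1
        row = row + 1 if col == 0 else row
    return visited
```

For a 28 x 28 image this visits all 784 positions in row-major order, which is consistent with the forward pass producing the expected tensor.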

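This works, at least in the forward direction. That is, if I look at what is returned in a session, I get the 3D tensor I want, call it T, of shape [height, width, units], where T[i, j, :] contains the activation of the LSTM cell at input i, j.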

I then want to classify each pixel. For this purpose I convolve over T, reshape the result to [height * width, num_labels], and construct the cross-entropy loss.

    T = tf.nn.conv2d(T, W, strides = [1, 1, 1, 1], padding = 'VALID')
    T = tf.reshape(T, [height * width, num_labels])

    loss = tf.reduce_mean(
                        tf.nn.softmax_cross_entropy_with_logits(
                        labels = tf.reshape(labels, [height * width, num_labels]), 
                        logits = T)
                        )
    optimizer = tf.train.AdagradOptimizer(0.01).minimize(loss)
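
For reference, this is what the loss computes per pixel, written as a plain-Python sketch rather than TensorFlow code. The function names are hypothetical; it assumes one row of logits and one one-hot label row per pixel, matching the [height * width, num_labels] layout above.

```python
import math


def softmax_cross_entropy(logits, one_hot):
    """Cross-entropy between softmax(logits) and a one-hot label vector."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(y * (x - log_sum) for x, y in zip(logits, one_hot))


def mean_pixel_loss(logits_rows, label_rows):
    """Average the per-pixel cross-entropy over all rows (pixels)."""
    losses = [softmax_cross_entropy(l, y) for l, y in zip(logits_rows, label_rows)]
    return sum(losses) / len(losses)
```

With uniform logits over two labels, every pixel contributes log(2) regardless of its label, which makes for a quick sanity check.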

Problem

However, when I now try this with a 28 x 28 image and 32 units and run

    sess.run(optimizer, feed_dict = feed_dict)

I get the following error:

File "Assignment2/train_model.py", line 52, in <module>
    train_models()
  File "/Assignment2/train_model.py", line 12, in train_models
    image, out, labels, optomizer, accuracy, prediction, ac = build_graph(28, 28)
  File "/Assignment2/multidimensional.py", line 101, in build_graph
    optimizer = tf.train.AdagradOptimizer(0.01).minimize(loss)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 196, in minimize
    grad_loss=grad_loss)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 253, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients.py", line 491, in gradients
    in_grad.set_shape(t_in.get_shape())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 408, in set_shape
    self._shape = self._shape.merge_with(shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 579, in merge_with
    (self, other))
ValueError: Shapes (784, 32) and (1, 32) are not compatible

I think this is a problem with computing the gradients through the tf.while_loop, but I am fairly lost at this point.
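
One observation that may help narrow it down: the two shapes in the error message match the size of the accumulated tensor before and after the loop. The snippet below is only that arithmetic, using the values from the question; it shows where the numbers come from and is not a diagnosis of the root cause.

```python
height, width, units = 28, 28, 32  # values from the question

# Accumulator before the loop: only the output for pixel (0, 0).
initial_shape = (1, units)

# Accumulator after the loop: one row appended per remaining pixel.
final_shape = (height * width, units)
```

So (1, 32) corresponds to the tensor passed in as a loop variable and (784, 32) to the tensor that comes out, which suggests the gradient code is comparing the loop's input and output shapes.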
