Dimension mismatch when building an LSTM RNN in TensorFlow

Date: 2018-04-09 08:11:55

Tags: tensorflow machine-learning neural-network lstm

I am trying to build a multi-layer, multi-class, multi-label LSTM in TensorFlow. I have been trying to adapt this tutorial to my data.

However, I am getting an error message stating that my dimensions do not match when building the RNN.

ValueError: Dimensions must be equal, but are 1000 and 923 for 'rnn/while/rnn/multi_rnn_cell/cell_0/lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,1000], [923,2000].

I cannot figure out which variable in the construction of the architecture is incorrect:

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.0, shape=shape)
    return tf.Variable(initial)


def lstm(x, weight, bias, n_steps, n_classes):

    cell = tf.nn.rnn_cell.LSTMCell(cfg.n_hidden_cells_in_layer, state_is_tuple=True)
    multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)

    # FIXME : ERROR binding x to LSTM as it is
    output, state = tf.nn.dynamic_rnn(multi_layer_cell, x, dtype=tf.float32)
    # FIXME : ERROR

    output_flattened = tf.reshape(output, [-1, cfg.n_hidden_cells_in_layer])
    output_logits = tf.add(tf.matmul(output_flattened, weight), bias)

    output_all = tf.nn.sigmoid(output_logits)
    output_reshaped = tf.reshape(output_all, [-1, n_steps, n_classes])

    # ??? switch batch size with sequence size. ???
    # then gather last time step values
    output_last = tf.gather(tf.transpose(output_reshaped, [1, 0, 2]), n_steps - 1)


    return output_last, output_all
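
A side note on the lines marked FIXME: in TensorFlow 1.x, MultiRNNCell([cell] * 2) stacks the same LSTMCell object twice, so both layers share a single kernel even though the second layer's input width differs from the first's. A minimal sketch of building one cell per layer instead, assuming TF 1.x:

# Sketch (assumes TF 1.x): create a separate LSTMCell per layer so each
# layer builds its own kernel, sized to that layer's actual input width.
cells = [tf.nn.rnn_cell.LSTMCell(cfg.n_hidden_cells_in_layer, state_is_tuple=True)
         for _ in range(2)]
multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell(cells)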

These are my placeholders, loss function, and all that jazz:

x_test, y_test = load_multiple_vector_files(test_filepaths)
x_valid, y_valid = load_multiple_vector_files(valid_filepaths)

n_input, n_steps, n_classes = get_input_target_lengths(check_print=False)


# FIXME n_input should be the problem
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])
y_steps = tf.placeholder("float", [None, n_classes])

weight = weight_variable([cfg.n_hidden_layers, n_classes])
bias = bias_variable([n_classes])
y_last, y_all = lstm(x, weight, bias, n_steps, n_classes)

#all_steps_cost=tf.reduce_mean(-tf.reduce_mean((y_steps * tf.log(y_all))+(1 - y_steps) * tf.log(1 - y_all),reduction_indices=1))
all_steps_cost = -tf.reduce_mean((y_steps * tf.log(y_all)) + (1 - y_steps) * tf.log(1 - y_all))
last_step_cost = -tf.reduce_mean((y * tf.log(y_last)) + ((1 - y) * tf.log(1 - y_last)))
loss_function = (cfg.alpha * all_steps_cost) + ((1 - cfg.alpha) * last_step_cost)

optimizer = tf.train.AdamOptimizer(learning_rate=cfg.learning_rate).minimize(loss_function)
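
As a side note, computing the cross-entropy from the raw logits is numerically safer than taking tf.log of a sigmoid, which can produce NaN once the sigmoid saturates. A sketch, assuming the pre-sigmoid output_logits were also returned from lstm() (they are not in the code above):

# Sketch: equivalent sigmoid cross-entropy over all steps, computed from
# the pre-sigmoid logits so tf.log never sees an exact 0 or 1.
# Assumes output_logits is returned from lstm() alongside the activations.
all_steps_cost = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_steps, logits=output_logits))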

I am pretty sure it is the X placeholder that is causing the problem, making the layers mismatch in their matrix dimensions. The linked example uses constants for its dimensions, so it is hard to see what each one actually represents.
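
One quick way to check that suspicion is to print the static shapes of the tensors involved before building the rest of the graph; a sketch:

# Sketch: print static shapes to see which dimension is off.
print(x.get_shape())       # should be (?, n_steps, n_input)
print(weight.get_shape())  # should match (layer_width, n_classes)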

Can anyone help me out? :)

UPDATE: I made an "educated guess" about the mismatched dimensions. One of them is 2 * hidden_width, so the hidden layer receives its new input plus its old recurrent input. However, the mismatched dimension is input_width + hidden_width, as if it were trying to set the hidden layer's width for the input layer.
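
A hypothetical decomposition of the error shapes supports this reading, since a TF 1.x LSTMCell kernel has shape [input_width + num_units, 4 * num_units]:

# Hypothetical numbers, assuming num_units = cfg.n_hidden_cells_in_layer = 500:
#   kernel  [923, 2000] = [input_width + num_units, 4 * num_units]
#                       = [423 + 500, 4 * 500]
#   operand [?, 1000]   = [batch, num_units + num_units]
#                         (layer 2's input from layer 1, concatenated with its own state)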

1 Answer:

Answer 0 (score: 0)

I found out I was setting up my weight variable incorrectly: I used the constant for n_hidden_layers (the number of hidden layers) instead of n_hidden_cells_in_layer (the number of cells in a layer).
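
In code, the fix described above amounts to (a sketch of the corrected line):

# Corrected: the output weight maps from the width of a layer
# (cells per layer), not from the number of layers.
weight = weight_variable([cfg.n_hidden_cells_in_layer, n_classes])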