Question

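I am using an LSTM model in TensorFlow for a prediction project. The structure I implemented runs, but the accuracy on the test set is only 0.5. So I searched for tricks for training LSTM-based models, and the suggestion I found was "add dropout".

However, when I follow other people's tutorials, I get errors.

Here is the original version, which works: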

def lstmModel(x, weights, biases):
    x = tf.unstack(x, time_step, 1)

    lstm_cell = tf.nn.rnn_cell.LSTMCell(n_hidden, state_is_tuple=True, forget_bias=1)
    outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)

    return tf.matmul(outputs[-1], weights['out']) + biases['out']

After changing it to the version below, I get this error:

ValueError: Shape (90, ?) must have rank at least 3

def lstmModel(x, weights, biases):
    x = tf.unstack(x, time_step, 1)

    lstm_cell = tf.nn.rnn_cell.LSTMCell(n_hidden, state_is_tuple=True, forget_bias=1)
    lstm_dropout = tf.nn.rnn_cell.DropoutWrapper(lstm_cell, output_keep_prob=0.5)
    lstm_layers = rnn.MultiRNNCell([lstm_dropout]* 3)
    outputs, states = tf.nn.dynamic_rnn(lstm_layers, x, dtype=tf.float32)
    return tf.matmul(outputs[-1], weights['out']) + biases['out']

I am confused about whether the shape of my input data is wrong. Before it is passed to this function, the input x has shape (batch_size, time_step, data_size).

batch_size = 30 
time_step = 4 #read 4 words 
data_size = 80 # total 80 words, each is in np.shape of [1,80]

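So the input x for each batch has shape [30,4,80]. The input word x[0,0,80] is followed by the word x[0,1,80]. Does this design make sense?

The whole implementation is adapted from other tutorials, and I would also like to know what tf.unstack() actually does.

That is several questions... I have put the code on GitHub, including the "working version" and the "failed version" mentioned above. Only the function shown here differs! Please take a look, thanks!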

Answer 1

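Removing the tf.unstack from the second example should help.

tf.unstack is used to break a tensor into a list of tensors. In your case it will break x, which has size (batch_size, time_step, data_size), into a list of length time_step containing tensors of size (batch_size, data_size).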
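The shape change described above can be checked without TensorFlow. The following is a minimal sketch that uses NumPy as a stand-in: the helper `unstack` is hypothetical and only mimics splitting along an axis and dropping it; it is not the TensorFlow function itself. The sizes are the ones given in the question.

```python
import numpy as np

batch_size, time_step, data_size = 30, 4, 80  # sizes from the question

# Dummy batch with the shape the question passes into lstmModel.
x = np.zeros((batch_size, time_step, data_size), dtype=np.float32)

def unstack(a, axis):
    """Hypothetical NumPy stand-in: split `a` along `axis` and drop that axis."""
    return [np.take(a, i, axis=axis) for i in range(a.shape[axis])]

steps = unstack(x, axis=1)

print(len(steps))      # one list element per time step
print(steps[0].shape)  # each element is (batch_size, data_size)
print(steps[0].ndim)   # each element has rank 2, not 3
```

Each list element has rank 2, which is the form a statically unrolled RNN consumes one step at a time.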

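This is needed for tf.nn.static_rnn, since it unrolls the RNN during graph creation and therefore needs a pre-specified number of steps, which is the length of the list from tf.unstack.

tf.nn.dynamic_rnn is unrolled on each run so that it can do a variable number of steps. It therefore takes a single tensor where dimension 0 is batch_size, dimension 1 is time_step, and dimension 2 is data_size (or the first two dimensions are reversed if time_major is True).

The error is due to tf.nn.dynamic_rnn expecting a 3D tensor, while each element in the provided input list is only 2D because of the tf.unstack.

tl;dr: use tf.unstack with tf.nn.static_rnn, but never use it with tf.nn.dynamic_rnn.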
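One follow-on detail worth checking after dropping tf.unstack: with batch-major input, tf.nn.dynamic_rnn returns outputs as a single tensor of shape (batch_size, time_step, n_hidden) rather than a list of per-step tensors, so `outputs[-1]` no longer selects the last time step. The sketch below illustrates the indexing difference with NumPy arrays standing in for the RNN outputs; the value of `n_hidden` is an assumption, since the question does not give it.

```python
import numpy as np

batch_size, time_step = 30, 4  # sizes from the question
n_hidden = 16                  # assumed value, not given in the question

# Stand-in for batch-major dynamic_rnn outputs: one 3D array.
dynamic_outputs = np.random.rand(batch_size, time_step, n_hidden).astype(np.float32)

# Stand-in for static_rnn outputs: a list with one 2D array per time step.
static_outputs = [dynamic_outputs[:, t, :] for t in range(time_step)]

last_static = static_outputs[-1]          # last time step, all batch elements
wrong = dynamic_outputs[-1]               # last batch element, all time steps
last_dynamic = dynamic_outputs[:, -1, :]  # last time step, all batch elements

print(last_static.shape, wrong.shape, last_dynamic.shape)
```

Under these assumptions, the final matmul in the second version would probably need the last time step sliced out as `outputs[:, -1, :]` instead of `outputs[-1]`.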

TensorFlow: LSTM dropout implementation, shape problem
