I would like to build a toy LSTM model for regression. This nice tutorial is already too complicated for a beginner.
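Given a sequence of length time_steps, predict the next value. Consider time_steps=3 and the sequences:

array([
   [[ 1.],
    [ 2.],
    [ 3.]],
   [[ 2.],
    [ 3.],
    [ 4.]],
   ...

The target values should be:

array([ 4., 5., ...

I defined the following model:

import tensorflow as tf

# Network Parameters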
time_steps = 3
num_neurons= 64 #(arbitrary)
n_features = 1
# tf Graph input
x = tf.placeholder("float", [None, time_steps, n_features])
y = tf.placeholder("float", [None, 1])
# Define weights
weights = {
    # Maps the num_neurons LSTM outputs to a single regression value
    'out': tf.Variable(tf.random_normal([num_neurons, 1]))
}
biases = {
    'out': tf.Variable(tf.random_normal([1]))
}
# LSTM model
def lstm_model(X, weights, biases, learning_rate=0.01, optimizer='Adagrad'):
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, time_steps, n_features)
    # Required shape: 'time_steps' tensors list of shape (batch_size, n_features)

    # Permuting batch_size and time_steps
    # input dimension: Tensor("Placeholder_:0", shape=(?, 3, 1), dtype=float32)
    X = tf.transpose(X, [1, 0, 2])
    # transposed dimension: Tensor("transpose_41:0", shape=(3, ?, 1), dtype=float32)

    # Reshaping to (time_steps*batch_size, n_features)
    X = tf.reshape(X, [-1, n_features])
    # reshaped dimension: Tensor("Reshape_:0", shape=(?, 1), dtype=float32)

    # Split to get a list of 'time_steps' tensors of shape (batch_size, n_features)
    # (split_dim, num_split, value) is the argument order of pre-1.0 TensorFlow
    X = tf.split(0, time_steps, X)
    # split dimension: [<tf.Tensor 'split_:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'split_:1' shape=(?, 1) dtype=float32>, <tf.Tensor 'split_:2' shape=(?, 1) dtype=float32>]

    # LSTM cell
    cell = tf.nn.rnn_cell.LSTMCell(num_neurons)  # or GRUCell(num_neurons)
    output, state = tf.nn.dynamic_rnn(cell=cell, inputs=X, dtype=tf.float32)

    # Keep only the output of the last time step
    output = tf.transpose(output, [1, 0, 2])
    last = tf.gather(output, int(output.get_shape()[0]) - 1)

    # Linear layer on top of the last LSTM output
    return tf.matmul(last, weights['out']) + biases['out']
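Instantiating the LSTM model as follows raises an error:

pred = lstm_model(x, weights, biases)

1) Do you know what the problem is?
2) Will multiplying the LSTM output by the weights yield the regression?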
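A minimal sketch of how the sequences and targets described above could be generated with a sliding window. It assumes NumPy and a hypothetical helper named make_windows, neither of which is part of the original post. The targets are reshaped to (n_samples, 1) so that they match the y placeholder.

```python
import numpy as np

def make_windows(series, time_steps):
    """Split a 1-D series into overlapping input windows and next-value targets."""
    series = np.asarray(series, dtype=np.float32)
    n_samples = len(series) - time_steps
    # Each input is `time_steps` consecutive values.
    inputs = np.stack([series[i:i + time_steps] for i in range(n_samples)])
    # Add a trailing feature axis: (n_samples, time_steps, n_features=1).
    inputs = inputs[..., np.newaxis]
    # Each target is the value that immediately follows its window.
    targets = series[time_steps:].reshape(-1, 1)
    return inputs, targets

inputs, targets = make_windows(np.arange(1, 11), time_steps=3)
print(inputs.shape, targets.shape)
print(inputs[0].ravel(), targets[0])
```

Arrays of this shape can be fed to the x and y placeholders through feed_dict.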
Answer 0 (score: 8)
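As discussed in the comments, the tf.nn.dynamic_rnn(cell, inputs, ...) function expects a list of three-dimensional tensors* as its inputs argument, where the dimensions are interpreted by default as batch_size x num_timesteps x num_features. (If you pass time_major=True, they are interpreted as num_timesteps x batch_size x num_features.) Therefore the preprocessing you apply to the original placeholder is unnecessary, and you can pass the original X value directly to tf.nn.dynamic_rnn().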
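The two layouts differ only by a swap of the first two axes. The sketch below illustrates this with NumPy arrays standing in for tensors. The sizes are arbitrary assumptions, and nothing here calls TensorFlow itself.

```python
import numpy as np

batch_size, num_timesteps, num_features = 4, 3, 1

# Default layout (time_major=False): batch_size x num_timesteps x num_features.
batch_major = np.arange(batch_size * num_timesteps * num_features, dtype=np.float32)
batch_major = batch_major.reshape(batch_size, num_timesteps, num_features)

# Layout expected with time_major=True: num_timesteps x batch_size x num_features.
time_major = np.transpose(batch_major, (1, 0, 2))

print(batch_major.shape)
print(time_major.shape)

# The same (sample, step) element sits at swapped indices in the two layouts.
same_element = bool(batch_major[2, 1, 0] == time_major[1, 2, 0])
print(same_element)
```

The x placeholder in the question already has the default batch-major shape (batch_size, time_steps, n_features), so the transpose, reshape, and split steps can be dropped and x handed to tf.nn.dynamic_rnn() unchanged.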
* Technically it can accept complicated nested structures in addition to lists, but the leaf elements must be three-dimensional tensors.**
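** Investigating this turned up a bug in the implementation of tf.nn.dynamic_rnn(). In principle, it should be sufficient for the inputs to have at least two dimensions, but the time_major=False path assumes that they have exactly three dimensions when it transposes the inputs into time-major form, and the error message that this bug inadvertently causes is the one that shows up in your program. We are working on getting that fixed.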
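The failure mode can be imitated outside TensorFlow. The following is a NumPy analogy, not the actual TensorFlow implementation: a hypothetical to_time_major helper swaps the first two axes with a three-axis permutation, which works for three-dimensional input and fails for the two-dimensional tensors that the question's split step produces.

```python
import numpy as np

def to_time_major(batch_major):
    """Swap the batch and time axes, assuming exactly three dimensions."""
    return np.transpose(batch_major, (1, 0, 2))

# A three-dimensional input converts without trouble.
ok = to_time_major(np.zeros((5, 3, 1)))
print(ok.shape)

# One element of the list built by the split step has shape (batch_size, n_features).
two_dim = np.zeros((5, 1))
try:
    to_time_major(two_dim)
    failed = False
except ValueError as err:
    # A three-axis permutation cannot be applied to a two-dimensional array.
    failed = True
    print("transpose failed:", err)
```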