This is a simple example of using an LSTM cell from TensorFlow. I generate a sine wave and train my network on ten periods of it, then try to predict the eleventh period. The predictor X is the true y lagged by one time step. After training, I save the session to disk and restore it at prediction time - this is typical of training a model and then deploying it to production.
When I predict the last period, y_predicted matches the true y very closely.
If I try to predict the sine wave from an arbitrary starting point instead (i.e. uncommenting the line test_data = test_data[16:] so that the true values of y are offset by a quarter period), it seems the LSTM prediction still starts at zero and takes a while to catch up with the true values, eventually matching the previous prediction. In fact, the prediction in the second case still looks like a full sine wave instead of a 3/4 wave.
Why is this happening? If I implement a regressor, I would like to use it starting from an arbitrary point.
https://github.com/fbora/mytensorflow/issues/1
import os
import pandas as pd
import numpy as np
import tensorflow as tf
import tensorflow.contrib.rnn as rnn
def sin_signal():
    '''
    generate a sin function
    the train set is ten periods in length
    the test set is one additional period
    the return variable is in pandas format for easy plotting
    '''
    phase = np.arange(0, 2*np.pi*11, 0.1)
    y = np.sin(phase)
    data = pd.DataFrame.from_dict({'phase': phase, 'y': y})
    # fill the last element by 0 - it's the end of the period anyways
    data['X'] = data.y.shift(-1).fillna(0.0)
    train_data = data[data.phase <= 2*np.pi*10].copy()
    test_data = data[data.phase > 2*np.pi*10].copy()
    return train_data, test_data
class lstm_model():
    def __init__(self, size_x, size_y, num_units=32, num_layers=3, keep_prob=0.5):
        # def single_unit():
        #     return rnn.DropoutWrapper(
        #         rnn.LSTMCell(num_units), output_keep_prob=keep_prob)
        def single_unit():
            return rnn.LSTMCell(num_units)

        self.graph = tf.Graph()
        with self.graph.as_default():
            '''input place holders'''
            self.X = tf.placeholder(tf.float32, [None, size_x], name='X')
            self.y = tf.placeholder(tf.float32, [None, size_y], name='y')

            '''network'''
            cell = rnn.MultiRNNCell([single_unit() for _ in range(num_layers)])
            X = tf.expand_dims(self.X, -1)
            val, state = tf.nn.dynamic_rnn(cell, X, time_major=True, dtype=tf.float32)
            val = tf.transpose(val, [1, 0, 2])
            last = tf.gather(val, int(val.get_shape()[0]) - 1)
            weights = tf.Variable(tf.truncated_normal([num_units, size_y], 0.0, 1.0), name='weights')
            bias = tf.Variable(tf.zeros(size_y), name='bias')
            predicted_y = tf.nn.xw_plus_b(last, weights, bias, name='predicted_y')

            '''optimizer'''
            optimizer = tf.train.AdamOptimizer(name='adam_optimizer')
            global_step = tf.Variable(0, trainable=False, name='global_step')
            self.loss = tf.reduce_mean(tf.squared_difference(predicted_y, self.y), name='mse_loss')
            self.train_op = optimizer.minimize(self.loss, global_step=global_step, name='training_op')

            '''initializer'''
            self.init_op = tf.global_variables_initializer()
class lstm_regressor():
    def __init__(self):
        if not os.path.isdir('./check_pts'):
            os.mkdir('./check_pts')

    @staticmethod
    def get_shape(dataframe):
        df_shape = dataframe.shape
        num_rows = df_shape[0]
        num_cols = 1 if len(df_shape) < 2 else df_shape[1]
        return num_rows, num_cols

    def train(self, X_train, y_train, iterations):
        train_pts, size_x = lstm_regressor.get_shape(X_train)
        train_pts, size_y = lstm_regressor.get_shape(y_train)
        model = lstm_model(size_x=size_x, size_y=size_y, num_units=32, num_layers=1)

        with tf.Session(graph=model.graph) as sess:
            sess.run(model.init_op)
            saver = tf.train.Saver()

            feed_dict = {
                model.X: X_train.values.reshape(-1, size_x),
                model.y: y_train.values.reshape(-1, size_y)
            }
            for step in range(iterations):
                _, loss = sess.run([model.train_op, model.loss], feed_dict=feed_dict)
                if step % 100 == 0:
                    print('step={}, loss={}'.format(step, loss))
            saver.save(sess, './check_pts/lstm')

    def predict(self, X_test):
        test_pts, size_x = lstm_regressor.get_shape(X_test)
        X_np = X_test.values.reshape(-1, size_x)

        graph = tf.Graph()
        with graph.as_default():
            with tf.Session() as sess:
                sess.run(tf.global_variables_initializer())
                saver = tf.train.import_meta_graph('./check_pts/lstm.meta')
                saver.restore(sess, './check_pts/lstm')

                X = graph.get_tensor_by_name('X:0')
                y_tf = graph.get_tensor_by_name('predicted_y:0')

                y_np = sess.run(y_tf, feed_dict={X: X_np})
                return y_np.reshape(test_pts)
def main():
    train_data, test_data = sin_signal()
    regressor = lstm_regressor()
    regressor.train(train_data.X, train_data.y, iterations=1000)
    # test_data = test_data[16:]
    y_predicted = regressor.predict(test_data.X)
    test_data['y_predicted'] = y_predicted
    test_data[['y', 'y_predicted']].plot()

if __name__ == '__main__':
    main()
Answer 0 (score: 0)
I suspect that, since you start predicting from an arbitrary point in the future, there is a gap of values between what your model was trained on and what it starts predicting from, and the state of your LSTM has not been updated with the values in that gap.
*** Update:
In your code, you have this:
val, state = tf.nn.dynamic_rnn(cell, X, time_major=True, dtype=tf.float32)
and then during training:
_, loss = sess.run([model.train_op, model.loss], feed_dict=feed_dict)
I would suggest feeding an initial state into dynamic_rnn and re-feeding the updated state at each training iteration, like this:
inState = tf.placeholder(tf.float32, [YOUR_DIMENSIONS], name='inState')
val, state = tf.nn.dynamic_rnn(cell, X, time_major=True, dtype=tf.float32, initial_state=inState)
Then, during training:
iState = np.zeros([YOUR_DIMENSIONS])
feed_dict = {
    model.X: X_train.values.reshape(-1, size_x),
    model.y: y_train.values.reshape(-1, size_y),
    inState: iState  # feed initial value for state placeholder
}
_, loss, oState = sess.run([model.train_op, model.loss, model.state], feed_dict=feed_dict) # run one additional variable from the session
iState = oState # assign latest out-state to be re-fed as in-state
This way, your model not only learns its parameters during training, it also keeps track of everything it has seen during training in its state. Now you save this state together with the rest of your session and use it during the prediction phase.
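To make YOUR_DIMENSIONS concrete: for the model in the question (one rnn.LSTMCell layer with num_units=32, and batch size 1 after the time-major reshape), the state is a single LSTMStateTuple of two [1, 32] tensors. Below is a minimal self-contained sketch of the placeholder wiring and the re-feeding loop; the names X_ph, c_in, h_in, c_np, h_np are illustrative and not from the original code:

import numpy as np
import tensorflow as tf
import tensorflow.contrib.rnn as rnn

num_units, batch_size = 32, 1
# time-major input: [time, batch, depth], matching the expand_dims in the question
X_ph = tf.placeholder(tf.float32, [None, batch_size, 1], name='X_ph')
c_in = tf.placeholder(tf.float32, [batch_size, num_units], name='c_in')
h_in = tf.placeholder(tf.float32, [batch_size, num_units], name='h_in')
cell = rnn.MultiRNNCell([rnn.LSTMCell(num_units)])
init_state = (rnn.LSTMStateTuple(c_in, h_in),)  # one LSTMStateTuple per layer
val, state = tf.nn.dynamic_rnn(cell, X_ph, time_major=True,
                               initial_state=init_state, dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    seq = np.sin(np.arange(0, 2*np.pi, 0.1)).reshape(-1, 1, 1).astype(np.float32)
    # start from zeros, then re-feed the returned state on every iteration
    c_np = np.zeros([batch_size, num_units], dtype=np.float32)
    h_np = np.zeros_like(c_np)
    for step in range(10):
        _, out_state = sess.run([val, state],
                                feed_dict={X_ph: seq, c_in: c_np, h_in: h_np})
        c_np, h_np = out_state[0].c, out_state[0].h  # carry the state forward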
One small difficulty with this is that, technically, this state is fed through a placeholder, so in my experience it is not saved in the Graph automatically. So at the end of training you manually create another variable and assign the state to it; that way it is saved in the graph for later use:
# make sure this variable is declared BEFORE the saver is declared
savedState = tf.get_variable('savedState', shape=[YOUR_DIMENSIONS])
# then, at the end of training:
assignOp = tf.assign(savedState, oState)
sess.run(assignOp)
# now save your graph
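On the restore side, that variable can then be fetched back by name. A minimal sketch, assuming the checkpoint paths from the question and the 'savedState' name from the snippet above:

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph('./check_pts/lstm.meta')
        saver.restore(sess, './check_pts/lstm')
        saved_state = graph.get_tensor_by_name('savedState:0')
        state_np = sess.run(saved_state)  # numpy value of the final training state
        # state_np is what you feed as the in-state for the first prediction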
Now, once you restore the graph, if you want to start your predictions after some artificial gap, you still have to run your model through that gap somehow in order to update the state. In my case, I just ran one dummy prediction over the whole gap, purely to update the state, and then continued at the normal interval from there.
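In code, that dummy run could look like the sketch below. It reuses the illustrative names from the earlier sketches (X_ph, c_in, h_in, state) and assumes c_np/h_np hold the state recovered from savedState; none of these names exist in the original code:

# run the restored model over the gap once, only to advance the state;
# the predictions for the gap itself are thrown away
gap_X = test_data.X.values[:16].reshape(-1, 1, 1)  # inputs covering the skipped quarter period
out_state = sess.run(state, feed_dict={X_ph: gap_X, c_in: c_np, h_in: h_np})
c_np, h_np = out_state[0].c, out_state[0].h        # the state now reflects the gap
# continue predicting at the normal interval from here, feeding c_np/h_np as the in-state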
Hope this helps...