RNN sequence learning

Time: 2017-11-02 21:27:43

Tags: tensorflow deep-learning rnn

I am new to RNN prediction with TensorFlow. I am trying to use an RNN with BasicLSTMCell to predict the next value of a sequence, for example:

1,2,3,4,5 ->6
3,4,5,6,7 ->8
35,36,37,38,39 ->40

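My code runs without errors, but the output appears to be the same for every batch, and the cost does not seem to decrease during training.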

When I divide all the training data by 100:

0.01,0.02,0.03,0.04,0.05 ->0.06
0.03,0.04,0.05,0.06,0.07 ->0.08 
0.35,0.36,0.37,0.38,0.39 ->0.40

the results are very good, and the correlation between predicted and actual values is very high (0.9998).

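I suspect the problem has something to do with integers versus floating-point numbers, but I cannot explain why. Can anyone help? Thanks a lot!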

Here is the code:

library(tensorflow)
start=sample(1:1000, 100000, T)
start1= start+1
start2=start1+1
start3= start2+1
start4=start3+1
start5= start4+1
start6=start5+1
label=start6+1
data=data.frame(start, start1, start2, start3, start4, start5, start6, label)
data=as.matrix(data)
n = nrow(data)
trainIndex = sample(1:n, size = round(0.7*n), replace=FALSE)
train = data[trainIndex ,]
test = data[-trainIndex ,]
train_data= train[,1:7]
train_label= train[,8]
means=apply(train_data, 2, mean)
sds= apply(train_data, 2, sd)
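# sweep() applies means/sds per column; plain (matrix - vector) recycles the vector down the rows
train_data=sweep(sweep(train_data, 2, means, "-"), 2, sds, "/")
test_data=test[,1:7]
test_data=sweep(sweep(test_data, 2, means, "-"), 2, sds, "/")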
test_label=test[,8]
batch_size = 50L            
n_inputs = 1L               # one feature per time step
n_steps = 7L                # time steps
n_hidden_units = 10L        # neurons in hidden layer
n_outputs = 1L              # one regression output
x = tf$placeholder(tf$float32, shape(NULL, n_steps, n_inputs))
y = tf$placeholder(tf$float32, shape(NULL, 1L))
weights_in= tf$Variable(tf$random_normal(shape(n_inputs, n_hidden_units)))
weights_out= tf$Variable(tf$random_normal(shape(n_hidden_units, 1L)))
biases_in=tf$Variable(tf$constant(0.1, shape= shape(n_hidden_units )))
biases_out = tf$Variable(tf$constant(0.1, shape=shape(1L)))
RNN=function(X, weights_in, weights_out, biases_in, biases_out)
{
    X = tf$reshape(X, shape=shape(-1, n_inputs))
    X_in = tf$sigmoid (tf$matmul(X, weights_in) + biases_in)
    X_in = tf$reshape(X_in, shape=shape(-1, n_steps, n_hidden_units))
    lstm_cell = tf$contrib$rnn$BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=T)
    init_state = lstm_cell$zero_state(batch_size, dtype=tf$float32)
    outputs_final_state = tf$nn$dynamic_rnn(lstm_cell, X_in, initial_state=init_state, time_major=F)
    outputs= tf$unstack(tf$transpose(outputs_final_state[[1]], shape(1,0,2)))
    results =  tf$matmul(outputs[[length(outputs)]], weights_out) + biases_out
    return(results)
}
pred = RNN(x, weights_in, weights_out, biases_in, biases_out)
cost = tf$losses$mean_squared_error(pred, y)
train_op = tf$contrib$layers$optimize_loss(loss=cost, global_step=tf$contrib$framework$get_global_step(), learning_rate=0.05, optimizer="SGD")
init <- tf$global_variables_initializer()
sess <- tf$Session()
sess$run(init)
step = 0
while (step < 1000)
{
  train_data2= train_data[(step*batch_size+1) : (step*batch_size+batch_size) ,  ]
  train_label2=train_label[(step*batch_size+1):(step*batch_size+batch_size)]
  batch_xs <- sess$run(tf$reshape(train_data2, shape(batch_size, n_steps, n_inputs))) # Reshape
  batch_ys= matrix(train_label2, ncol=1)
  sess$run(train_op, feed_dict = dict(x = batch_xs, y= batch_ys))
  mycost <- sess$run(cost, feed_dict = dict(x = batch_xs, y= batch_ys))
  print (mycost)
  test_data2= test_data[(0*batch_size+1) : (0*batch_size+batch_size) ,  ]
  test_label2=test_label[(0*batch_size+1):(0*batch_size+batch_size)]
  batch_xs <- sess$run(tf$reshape(test_data2, shape(batch_size, n_steps, n_inputs))) # Reshape
  batch_ys= matrix(test_label2, ncol=1)
  step=step+1
}

1 answer:

Answer 0: (score: 1)

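First, it is always very useful to normalize your network inputs (there are different ways: dividing by the maximum value, subtracting the mean and dividing by the standard deviation, and more). This helps the optimizer.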
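
As a rough illustration of the mean/standard-deviation variant, here is a minimal base R sketch on toy data shaped like the question's. Scaling the label as well is an assumption on my part (the answer only mentions inputs), motivated by the observation that dividing everything by 100 worked; `unscale_label` is a hypothetical helper name, not part of any library.

```r
# Toy data shaped like the question's: 7 consecutive integers per row,
# label = the next integer.
set.seed(1)
start  <- sample(1:1000, 200, replace = TRUE)
inputs <- sapply(0:6, function(k) start + k)   # 200 x 7 matrix
label  <- start + 7

# Column-wise standardization: scale() uses one center and one scale
# value per column.
means <- colMeans(inputs)
sds   <- apply(inputs, 2, sd)
inputs_std <- scale(inputs, center = means, scale = sds)

# Assumption: standardize the regression target too, and keep the
# statistics so predictions can be mapped back to the original scale.
label_mean <- mean(label)
label_sd   <- sd(label)
label_std  <- (label - label_mean) / label_sd

# Hypothetical helper: undo the target scaling on model predictions.
unscale_label <- function(pred_std) pred_std * label_sd + label_mean

print(round(colMeans(inputs_std), 6))      # per-column means after scaling
print(round(apply(inputs_std, 2, sd), 6))  # per-column sds after scaling
```

The statistics should be computed on the training split only and then reused for the test split, as the question's code already does with `means` and `sds`.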

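Second, and actually the most important point in your case: you are applying a sigmoid function after the RNN output. If you look at a plot of the sigmoid function, you will see that it squashes every input into the range (0, 1). So no matter how large your input is, your output will always be at most 1. Therefore you should not use any activation function on the output layer in a regression problem.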
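
The range argument can be checked in isolation with a few lines of base R. This is only a sketch of the sigmoid's behavior, not a reproduction of the posted network; the `sigmoid` function below is written out by hand for illustration.

```r
# Logistic sigmoid, written out in base R.
sigmoid <- function(z) 1 / (1 + exp(-z))

z <- c(-10, -1, 0, 1, 10, 100)
s <- sigmoid(z)
print(s)

# Every value stays within [0, 1] numerically (strictly inside (0, 1)
# mathematically), so a sigmoid output cannot reach a raw target like 40 ...
print(all(s >= 0 & s <= 1))

# ... but it can represent the rescaled target 40 / 100 = 0.4,
# since sigmoid(log(p / (1 - p))) recovers p.
print(sigmoid(log(0.4 / 0.6)))
```

This is consistent with the question's observation that targets such as 0.06 or 0.40 were fit well while targets such as 6 or 40 were not.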

Hope it helps.