I am trying to build a network with this architecture:

Convolutional layer
Max pooling layer
Convolutional layer
Max pooling layer
Convolutional layer
Max pooling layer
BLSTM layer 1
BLSTM layer 2
Fully connected layer
CTC layer
My code is structured like this:
thisgraph = tf.Graph()
with thisgraph.as_default():
    x = tf.placeholder(tf.float32, [None, 1, None, nb_features])
    y = tf.sparse_placeholder(tf.int32)
    seq_len = tf.placeholder(tf.int32, [None])
    # 1st CNN layer followed by MP layer
    # 2nd CNN layer followed by MP layer
    # 3rd CNN layer followed by MP layer
    # Now two bidirectional LSTMs
    with tf.variable_scope("cell_def_1"):
        f_cell = tf.nn.rnn_cell.LSTMCell(nb_hidden, state_is_tuple=True)
        b_cell = tf.nn.rnn_cell.LSTMCell(nb_hidden, state_is_tuple=True)
    with tf.variable_scope("cell_op_1"):
        outputs, _ = tf.nn.bidirectional_dynamic_rnn(
            f_cell, b_cell, conv_reshape,
            sequence_length=seq_len, dtype=tf.float32)
    merge = tf.add(outputs[0], outputs[1])
    with tf.variable_scope("cell_def_2"):
        f1_cell = tf.nn.rnn_cell.LSTMCell(nb_hidden, state_is_tuple=True)
        b1_cell = tf.nn.rnn_cell.LSTMCell(nb_hidden, state_is_tuple=True)
    with tf.variable_scope("cell_op_2"):
        outputs2, _ = tf.nn.bidirectional_dynamic_rnn(
            f1_cell, b1_cell, merge,
            sequence_length=seq_len, dtype=tf.float32)
    # Finally a dense layer
    # A time-distributed dense layer
    loss = tf.nn.ctc_loss(logits_reshape, y, seq_len, time_major=True)
    cost = tf.reduce_mean(loss)
    optimizer = tf.train.RMSPropOptimizer(lr).minimize(cost)
    # Create the network saver
    new_saver = tf.train.Saver(tf.global_variables())

with tf.Session(graph=thisgraph) as session:
    if sys.argv[1] == "load":
        new_loader = tf.train.Saver(tf.global_variables())
        new_loader.restore(session, "Weights/model_last")
        print("Previous weights loaded")
    else:
        init_op = tf.global_variables_initializer()
        session.run(init_op)
        print("New weights initialized")
    # Now the training part
    for e in range(nbepochs):
        totalloss = 0.0
        for b in range(nbbatches):
            # Load batch data into feed
            batchloss, _ = session.run([cost, optimizer], feed)  # two fetches, two values
            totalloss = totalloss + batchloss
        avgloss = totalloss / nbbatches
        new_saver.save(session, "Weights/model_last")
The model is tested with online handwriting data, and the network itself works fine. The problem is that save/restore does not behave as expected. The weight and bias variables of my CNN layers all have distinct names. I trained for 100 epochs (pass 1), then loaded the checkpoint again for pass 2. I have verified that the weights and biases loaded at the start of pass 2 are exactly the ones saved at the end of pass 1. Yet at the beginning of pass 2, the training loss is higher than the loss at the end of pass 1. What am I doing wrong? Is it due to the optimizer configuration? Any help would be highly appreciated.
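For context on the optimizer suspicion: RMSProp keeps a per-variable accumulator of squared gradients, and if that slot state is not part of what gets restored, the first updates after a restart behave like those of a freshly initialized optimizer and can bump the loss. A minimal stdlib sketch of that effect (no TensorFlow; all names here are illustrative, not real API):

```python
# Minimal RMSProp on f(w) = w**2, illustrating why the optimizer's
# accumulator (slot) state matters across a save/restore boundary.

def rmsprop_step(w, acc, lr=0.1, decay=0.9, eps=1e-8):
    g = 2.0 * w                              # gradient of w**2
    acc = decay * acc + (1 - decay) * g * g  # running average of g^2
    w = w - lr * g / (acc ** 0.5 + eps)      # scaled update
    return w, acc

w, acc = 5.0, 0.0
for _ in range(100):                         # "pass 1" training
    w, acc = rmsprop_step(w, acc)

# Resume with the accumulator restored (full checkpoint):
w_cont, _ = rmsprop_step(w, acc)

# Resume with the accumulator reset to zero (slot state lost):
w_reset, _ = rmsprop_step(w, 0.0)

# The reset run takes a larger first step than the continued run.
```

That said, since `new_saver` in the code above is built from `tf.global_variables()` after `minimize()` is called, the RMSProp slot variables should already be included in the checkpoint; the sketch only shows what the symptom looks like when they are not restored.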