CTC loss not decreasing in Keras

Time: 2018-01-30 10:36:18

Tags: python deep-learning keras handwriting-recognition

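I am using Keras with the Theano backend for an online handwriting recognition problem, as described in this paper: http://papers.nips.cc/paper/3213-unconstrained-on-line-handwriting-recognition-with-recurrent-neural-networks.pdf

I followed the Keras image OCR example (https://github.com/keras-team/keras/blob/master/examples/image_ocr.py) and modified the code to work on online handwriting samples instead of image samples. I train for 200 epochs on a dataset of 842 text lines, and each epoch takes about 6 minutes. The CTC loss decreases after the first epoch but then stays constant for all remaining epochs. I have tried different optimizers (sgd, adam, adadelta) and learning rates (0.01, 0.1, 0.2), but the loss barely changes.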

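x_train.shape = (842, 1263, 4) [842 text lines, 1263 stroke points each, 4 dimensions per point]

y_train.shape = (842, 64) [842 text lines, each with up to max_len = 64 characters]

Number of label classes (len_alphabet) = 66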
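As a sanity check on these sizes, here is a minimal sketch of how the output layer width relates to the alphabet size under CTC. It assumes the convention where the blank symbol takes the last class index, so the softmax layer needs len_alphabet + 1 units and real labels must lie in 0 .. len_alphabet - 1. Whether the Theano backend follows exactly this convention is an assumption worth verifying. The helper name `check_ctc_classes` and the toy labels are hypothetical and not taken from the original script.

```python
def check_ctc_classes(labels, len_alphabet, num_output_units):
    """Return a list of problems; an empty list means the sizes agree.

    Assumes the blank symbol is the last class (num_output_units - 1).
    """
    problems = []
    if num_output_units != len_alphabet + 1:
        problems.append(
            "output layer has %d units, expected len_alphabet + 1 = %d"
            % (num_output_units, len_alphabet + 1))
    blank_index = num_output_units - 1
    flat = [c for row in labels for c in row]
    # A real label equal to (or above) the blank index would collide with it.
    if flat and max(flat) >= blank_index:
        problems.append(
            "largest label %d collides with or exceeds blank index %d"
            % (max(flat), blank_index))
    return problems


# Toy labels (hypothetical) drawn from a 66-symbol alphabet: indices 0..65.
toy_labels = [[0, 5, 65], [12, 3, 7]]
print(check_ctc_classes(toy_labels, len_alphabet=66, num_output_units=66))
print(check_ctc_classes(toy_labels, len_alphabet=66, num_output_units=67))
```

With 66 output units the check reports two problems for these toy labels. With 67 units it reports none.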

Code snapshot:

import numpy as np
from keras.layers import (Activation, Bidirectional, Dense, GRU, Input,
                          Lambda, TimeDistributed)
from keras.models import Model

# x_train, y_train, max_len, alphabet and ctc_lambda_func are assumed to be
# defined earlier in the full script (linked below).
size = x_train.shape[0]
trainable = True

# Two stacked bidirectional GRU layers over the stroke-point sequence.
inputs = Input(name='the_input', shape=x_train.shape[1:], dtype='float32')
rnn_encoded = Bidirectional(GRU(64, return_sequences=True),
                            name='bidirectional_1',
                            merge_mode='concat', trainable=trainable)(inputs)
birnn_encoded = Bidirectional(GRU(64, return_sequences=True),
                              name='bidirectional_2',
                              merge_mode='concat', trainable=trainable)(rnn_encoded)

# Per-time-step class probabilities over the 66 label classes.
output = TimeDistributed(Dense(66, activation='softmax'))(birnn_encoded)
y_pred = Activation('softmax', name='softmax')(output)

# Extra inputs required by the CTC loss, which is computed inside a Lambda layer.
labels = Input(name='the_labels', shape=[max_len], dtype='int32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')
loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')(
    [y_pred, labels, input_length, label_length])
model = Model(inputs=[inputs, labels, input_length, label_length], outputs=loss_out)

# The Lambda layer already outputs the loss, so the compiled loss just passes it through.
model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer='Adadelta')

# Build the NumPy arrays that feed the CTC inputs.
absolute_max_string_len = max_len
blank_label = len(alphabet) + 1  # not used further in this snippet
labels = np.ones([size, absolute_max_string_len])
input_length = np.zeros([size, 1])
label_length = np.zeros([size, 1])
source_str = []
for i in range(size):
    labels[i, :] = y_train[i]
    input_length[i] = x_train.shape[1]
    label_length[i] = len(y_train[i])
    source_str.append('')
inputs_again = {'the_input': x_train,
                'the_labels': labels,
                'input_length': input_length,
                'label_length': label_length,
                'source_str': source_str  # used for visualization only
                }
outputs = {'ctc': np.zeros([size])}  # dummy targets; the real loss comes from the 'ctc' layer
model.fit(inputs_again, outputs, epochs=200, batch_size=25)
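One thing in the loop above that may be worth checking: `len(y_train[i])` is the padded row width (64 for every line, given the shape of `y_train`), whereas CTC losses generally expect `label_length` to hold the number of real characters in each line. The sketch below recovers true lengths from padded rows and also computes the minimum number of time steps CTC needs for a label. It assumes padded positions hold a sentinel value. `PAD_VALUE = -1`, `true_label_lengths` and `min_ctc_input_length` are hypothetical names, since the padding scheme of the original script is not shown here.

```python
PAD_VALUE = -1  # hypothetical sentinel; substitute the real padding value


def true_label_lengths(padded_labels, pad_value=PAD_VALUE):
    """Count the non-padding entries in each padded label row."""
    return [sum(1 for c in row if c != pad_value) for row in padded_labels]


def min_ctc_input_length(label):
    """Fewest time steps CTC needs for this label: one per character,
    plus one blank between each pair of identical neighbours."""
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


# Toy padded labels (made-up values), each row padded to width 8.
padded = [[7, 4, 11, 11, 14, -1, -1, -1],
          [3, 0, 19, -1, -1, -1, -1, -1]]

lengths = true_label_lengths(padded)
print(lengths)  # [5, 3]

first_label = padded[0][:lengths[0]]
print(min_ctc_input_length(first_label))  # 6
```

With NumPy arrays the same count can be written as `(y_train != PAD_VALUE).sum(axis=1)`, again assuming that sentinel.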

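My full code is hosted here: https://github.com/aayushee/HWR/blob/master/Run/CTC.py These are screenshots of the model and the training: https://github.com/aayushee/HWR/blob/master/Run/model.png https://github.com/aayushee/HWR/blob/master/Run/epochs.png

Please suggest whether the model architecture needs to be modified, whether some other optimizer would work better for this problem, or whether there is anything else that could fix the issue. Thanks!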
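On the architecture question, one detail in the snippet may be worth ruling out: the `Dense` layer already uses `activation='softmax'`, and an `Activation('softmax')` layer is then applied on top of it. Softmax outputs lie in [0, 1], so a second softmax can only produce probabilities that differ from each other by a factor of at most e, which pushes every prediction towards uniform. The following is a small standalone illustration of that effect in plain Python, with made-up logits. It is not a diagnosis of the original model.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for 66 classes in which one class is strongly preferred.
logits = [0.0] * 66
logits[10] = 8.0

once = softmax(logits)   # a confident distribution
twice = softmax(once)    # softmax applied to probabilities instead of logits

print("max prob after one softmax:   %.4f" % max(once))
print("max prob after two softmaxes: %.4f" % max(twice))
print("max/min ratio after two:      %.4f" % (max(twice) / min(twice)))
```

If this turns out to be relevant, the usual arrangement is a single softmax, for example a linear `Dense` layer followed by the existing `Activation('softmax')` layer.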

0 answers:

No answers yet