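I followed the blog post below to do character recognition with a CNN: http://ankivil.com/kaggle-first-steps-with-julia-chars74k-first-place-using-convolutional-neural-networks
The only change I made was adding dim_ordering="th" for compatibility with the latest Keras.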
model = Sequential()
model.add(Convolution2D(128, 3, 3, border_mode='same', init='he_normal', activation = 'relu', input_shape=(1, img_rows, img_cols)))
print model.output_shape
model.add(Convolution2D(128, 3, 3, border_mode='same', init='he_normal', activation = 'relu'))
print model.output_shape
model.add(MaxPooling2D(pool_size=(2, 2), dim_ordering="th"))
print model.output_shape
model.add(Convolution2D(256, 3, 3, border_mode='same', init='he_normal', activation = 'relu'))
print model.output_shape
model.add(Convolution2D(256, 3, 3, border_mode='same', init='he_normal', activation = 'relu'))
print model.output_shape
model.add(MaxPooling2D(pool_size=(2, 2), dim_ordering="th"))
print model.output_shape
model.add(Convolution2D(512, 3, 3, border_mode='same', init='he_normal', activation = 'relu'))
print model.output_shape
model.add(Convolution2D(512, 3, 3, border_mode='same', init='he_normal', activation = 'relu'))
print model.output_shape
model.add(Convolution2D(512, 3, 3, border_mode='same', init='he_normal', activation = 'relu'))
print model.output_shape
model.add(MaxPooling2D(pool_size=(2, 2), dim_ordering="th"))
print model.output_shape
model.add(Flatten())
print model.output_shape
model.add(Dense(4096, init='he_normal', activation = 'relu'))
print model.output_shape
model.add(Dropout(0.5))
print model.output_shape
model.add(Dense(4096, init='he_normal', activation = 'relu'))
print model.output_shape
model.add(Dropout(0.5))
print model.output_shape
model.add(Dense(nb_classes, init='he_normal', activation = 'softmax'))
print model.output_shape
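As a cross-check for the printed output shapes, here is a minimal sketch of the shape arithmetic for the stack above. It assumes channels-first ("th") ordering in every layer and uses img_rows = img_cols = 32 purely as an illustrative value; the helper names are hypothetical and not part of Keras.

```python
# Sketch: trace (channels, rows, cols) through the layer stack above,
# assuming channels-first ("th") ordering everywhere.

def conv_same(shape, filters):
    """3x3 convolution with border_mode='same': spatial size is unchanged."""
    _, rows, cols = shape
    return (filters, rows, cols)

def max_pool(shape, pool=2):
    """Non-overlapping pool x pool max pooling: spatial size is floor-divided."""
    channels, rows, cols = shape
    return (channels, rows // pool, cols // pool)

def trace_shapes(img_rows, img_cols):
    """Return every intermediate shape and the size seen by Flatten()."""
    shape = (1, img_rows, img_cols)
    shapes = [shape]
    # (filters, number of conv layers) for each block, each followed by a pool
    for filters, n_convs in [(128, 2), (256, 2), (512, 3)]:
        for _ in range(n_convs):
            shape = conv_same(shape, filters)
            shapes.append(shape)
        shape = max_pool(shape)
        shapes.append(shape)
    flat = shape[0] * shape[1] * shape[2]
    return shapes, flat

# 32 x 32 is an assumed input size, not taken from the post.
shapes, flat = trace_shapes(32, 32)
for s in shapes:
    print(s)
print("flattened size:", flat)
```

If the values printed by model.output_shape do not follow this pattern (for example, if the 1 from input_shape shows up in a spatial position), that would suggest the convolution layers and the pooling layers are not using the same dim ordering.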
After 50-60 iterations, the accuracy I get is very poor, 0.07, and it stays stuck there.
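An accuracy that stays flat can mean the network is predicting a single class for every input. One way this might be checked is to compare 0.07 against the accuracy of always predicting the most frequent label. The sketch below assumes the training labels are available as a plain list; toy_labels is made-up data for illustration only.

```python
from collections import Counter

def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / float(len(labels))

# Hypothetical toy labels, only to show the call; not the real dataset.
toy_labels = ["A", "A", "B", "C", "A", "B", "E", "A", "D", "C"]
print(majority_baseline(toy_labels))
```

If the baseline on the real labels comes out close to 0.07, the model has likely collapsed to one class rather than learned anything.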
Could you give me some suggestions? I am also open to other models for doing OCR with a CNN.
Thanks, Shiva