I want to evaluate the Attention OCR model available at http://download.tensorflow.org/models/attention_ocr_2017_05_17.tar.gz on the Synth 90k test set. The problem is that evaluation yields very poor results: a character precision of only 0.1 is reported. It seems that for every input image, the model output is something related to the FSNS dataset.
Here is a list of input and output values when running the eval.py script with this command:
python eval.py --split_name test --train_log_dir attention_ocr_2017_05_17 --dataset_name synth90k --num_batches 10
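To make the 0.1 figure concrete, here is a minimal sketch of how a per-character accuracy can be computed over padded label sequences. It is not the metric code from the Attention OCR repository; the function name `char_accuracy` and the choice to skip padding positions are my assumptions, and `null_code=133` is simply taken from the config in this question.

```python
def char_accuracy(predictions, targets, null_code=133):
    """Fraction of non-padding target positions that were predicted correctly.

    predictions, targets: lists of equal-length lists of character ids.
    null_code: id used for padding (assumed, taken from the dataset config).
    """
    correct = 0
    total = 0
    for pred_seq, target_seq in zip(predictions, targets):
        for pred_id, target_id in zip(pred_seq, target_seq):
            if target_id == null_code:
                continue  # padding position in the ground truth, not counted
            total += 1
            correct += int(pred_id == target_id)
    # Avoid division by zero when there is nothing to score.
    return correct / total if total else 0.0


# Toy example: 3 real characters, 2 of them predicted correctly.
score = char_accuracy([[1, 9, 3, 133]], [[1, 2, 3, 133]])
```

With a score around 0.1, roughly one character in ten matches the label, which is consistent with the model emitting text unrelated to the input image.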
Here are some implementation details:
I have created a tfrecord file with 10 examples from the Synth 90k test subset. I have also created a charset_synth90k.txt file, which contains the character encodings (same content as the FSNS charset_size=134.txt).
This is my synth90k.py dataset file (only the changed lines are shown):
DEFAULT_DATASET_DIR = os.path.join(os.path.dirname(__file__), 'synth90k')
DEFAULT_CONFIG = {
    'name': 'synth90k',
    'splits': {
        'test': {'size': 10, 'pattern': 'synth90k_test*.tfrecord'}
    },
    'charset_filename': 'charset_synth90k.txt',
    'image_shape': (31, 200, 3),
    'num_of_views': 1,
    'max_sequence_length': 37,
    'null_code': 133,
    ...
}
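Before blaming the model, it may be worth checking that the labels written to the tfrecord agree with this config. The sketch below assumes the charset file uses one `code<TAB>character` entry per line (with a `<nul>` entry for padding), as I believe the FSNS charset does; `parse_charset` and `encode_label` are hypothetical helper names, not functions from the repository.

```python
import re


def parse_charset(lines):
    """Parse 'code<TAB>char' lines into a {char: code} dict (format assumed)."""
    pattern = re.compile(r"(\d+)\t(.+)")
    char_to_code = {}
    for line in lines:
        match = pattern.match(line.rstrip("\n"))
        if match is None:
            continue  # skip blank or malformed lines
        char_to_code[match.group(2)] = int(match.group(1))
    return char_to_code


def encode_label(text, char_to_code, max_sequence_length=37, null_code=133):
    """Map text to ids and pad with null_code up to max_sequence_length."""
    if len(text) > max_sequence_length:
        raise ValueError("label longer than max_sequence_length: %r" % text)
    missing = sorted(set(text) - set(char_to_code))
    if missing:
        raise ValueError("characters not in charset: %r" % missing)
    ids = [char_to_code[c] for c in text]
    return ids + [null_code] * (max_sequence_length - len(ids))


# Toy charset for illustration only; a real run would read charset_synth90k.txt.
toy_charset = parse_charset(["0\t ", "1\ta", "2\tb", "133\t<nul>"])
padded = encode_label("ab a", toy_charset, max_sequence_length=6)
```

If a Synth 90k label raises an error here, the dataset and charset disagree, which is a separate problem from the model having been trained on a different kind of image.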
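Answer 0 (score: 2)
The Attention OCR model was trained only on the FSNS training dataset, so it works only on images that are more or less similar to French street name signs. To apply it to images from a different distribution, you need to retrain it (or at least fine-tune it) on images from that distribution.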