How do I read the images and text in the FSNS dataset?

Time: 2017-07-10 04:11:54

Tags: python git ocr

I just want to read the images and text from the tfrecords files in the FSNS datasets. However, when I follow the Tfrecords Guide, I get the following error message:

InvalidArgumentError (see above for traceback): Name: <unknown>, Feature: encoded (data type: string) is required but could not be found.
[[Node: ParseSingleExample/ParseExample/ParseExample = ParseExample[Ndense=4, Nsparse=0, Tdense=[DT_STRING, DT_INT64, DT_STRING, DT_INT64], dense_shapes=[[], [], [], []], sparse_types=[], _device="/job:localhost/replica:0/task:0/cpu:0"](ParseSingleExample/ExpandDims, ParseSingleExample/ParseExample/ParseExample/names, ParseSingleExample/ParseExample/ParseExample/dense_keys_0, ParseSingleExample/ParseExample/ParseExample/dense_keys_1, ParseSingleExample/ParseExample/ParseExample/dense_keys_2, ParseSingleExample/ParseExample/ParseExample/dense_keys_3, ParseSingleExample/ParseExample/Const, ParseSingleExample/ParseExample/Const_1, ParseSingleExample/ParseExample/Const_2, ParseSingleExample/ParseExample/Const_3)]]

import tensorflow as tf
import skimage.io as io

IMAGE_HEIGHT = 384
IMAGE_WIDTH = 384

tfrecords_filename = '/home/wangjianbo_i/google_model/MyCode/models/attention_ocr/python/datasets/data/fsns/train/train-00511-of-00512'

def read_and_decode(filename_queue):

    reader = tf.TFRecordReader()

    _, serialized_example = reader.read(filename_queue)

    features = tf.parse_single_example(
      serialized_example,
      # Defaults are not specified since both keys are required.
      features={
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'encoded': tf.FixedLenFeature([], tf.string),
        'text': tf.FixedLenFeature([], tf.string)
        })

    image = tf.decode_raw(features['encoded'], tf.uint8)
    text = tf.decode_raw(features['text'], tf.uint8)

    height = tf.cast(features['height'], tf.int32)
    width = tf.cast(features['width'], tf.int32)

    image_shape = tf.stack([height, width, 3])

    image = tf.reshape(image, image_shape)

    image_size_const = tf.constant((IMAGE_HEIGHT, IMAGE_WIDTH, 3), dtype=tf.int32)

    resized_image = tf.image.resize_image_with_crop_or_pad(image=image,
                                           target_height=IMAGE_HEIGHT,
                                           target_width=IMAGE_WIDTH)

    images = tf.train.shuffle_batch( [resized_image],
                                                 batch_size=2,
                                                 capacity=30,
                                                 num_threads=2,
                                                 min_after_dequeue=10)

    return images,text


filename_queue = tf.train.string_input_producer(
    [tfrecords_filename], num_epochs=10)

image,text = read_and_decode(filename_queue)

init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())

with tf.Session()  as sess:

    sess.run(init_op)

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Let's read off 3 batches just for example
    for i in xrange(3):

        img,text= sess.run([image,text])
        print img,text
        print(img[0, :, :, :].shape) 
        print('current batch')

        io.imshow(img[0, :, :, :])
        io.show()

        io.imshow(img[1, :, :, :])
        io.show()

    coord.request_stop()
    coord.join(threads)

It seems the key names are wrong? My code is attached above. Could the author or anyone else check it and help me fix the error?


1 answer:

Answer 0 (score: 0)

To read the FSNS dataset, you can use https://github.com/tensorflow/models/blob/master/attention_ocr/python/datasets/fsns.py directly or as a reference.

The feature keys in the code snippet you provided are incorrect: they are missing the 'image/' prefix. It should be 'image/encoded' instead of just 'encoded', 'image/width' instead of 'width', and so on. See Table 4 in the FSNS dataset paper.
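
As a rough illustration of the key fix described in this answer, here is a minimal sketch that assumes the TensorFlow 1.x API used in the question. The names `FSNS_FEATURES`, `suggest_keys`, and `parse_fsns_example` are hypothetical helpers introduced only for this example. The shapes, and the assumption that the image bytes are PNG-encoded (so `tf.image.decode_png` takes the place of `tf.decode_raw`), follow my reading of datasets/fsns.py and have not been checked against the data itself.

```python
# Feature keys with the 'image/' prefix, as named in the answer.
# Dtype names and shapes are assumptions based on datasets/fsns.py.
FSNS_FEATURES = {
    'image/encoded': ('string', []),   # image bytes (assumed PNG-encoded)
    'image/width': ('int64', [1]),
    'image/text': ('string', [1]),     # ground-truth text
}


def suggest_keys(requested, prefix='image/'):
    """Map each requested key that is not an FSNS key to its prefixed form."""
    return {key: prefix + key
            for key in requested
            if key not in FSNS_FEATURES and prefix + key in FSNS_FEATURES}


def parse_fsns_example(serialized_example):
    """Parse one serialized FSNS record into (image, text) tensors."""
    # Imported here so the helpers above can be used without TensorFlow.
    import tensorflow as tf

    spec = {key: tf.FixedLenFeature(shape, getattr(tf, dtype))
            for key, (dtype, shape) in FSNS_FEATURES.items()}
    features = tf.parse_single_example(serialized_example, features=spec)

    # Assumption: the image is stored as PNG, not as raw pixel bytes.
    image = tf.image.decode_png(features['image/encoded'], channels=3)
    text = features['image/text'][0]
    return image, text


if __name__ == '__main__':
    # The keys used in the question, and the names they should have had.
    print(suggest_keys(['height', 'width', 'encoded', 'text']))
```

In the question's script, `parse_fsns_example(serialized_example)` would stand in for the `tf.parse_single_example` and `tf.decode_raw` calls inside `read_and_decode`. Note that `suggest_keys` returns nothing for 'height', because this sketch does not list an 'image/height' key.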