Question

I have the same problem: Input to reshape is a tensor with 37632 values, but the requested shape has 150528.

import os

import tensorflow as tf
from PIL import Image

# cwd and classes are defined elsewhere (not shown)
writer = tf.python_io.TFRecordWriter("/home/henson/Desktop/vgg/test.tfrecords")  # file to generate

for index, name in enumerate(classes):
    class_path = cwd + name + '/'
    for img_name in os.listdir(class_path):
        img_path = class_path + img_name  # path of each image
        img = Image.open(img_path)
        img = img.resize((224, 224))
        img_raw = img.tobytes()  # convert the image to raw bytes
        example = tf.train.Example(features=tf.train.Features(feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
            'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
        }))  # the Example wraps the label and the image data
        writer.write(example.SerializeToString())  # serialize to a string

writer.close()


def read_and_decode(filename):  # reads in dog_train.tfrecords
    filename_queue = tf.train.string_input_producer([filename])  # create a filename queue

    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)  # returns the key and the serialized record
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           'label': tf.FixedLenFeature([], tf.int64),
                                           'img_raw': tf.FixedLenFeature([], tf.string),
                                       })  # extract the image data and the label

    img = tf.decode_raw(features['img_raw'], tf.uint8)
    img = tf.reshape(img, [224, 224, 3])  # reshape to a 224x224 3-channel image
    img = tf.cast(img, tf.float32) * (1. / 255) - 0.5  # scale pixel values to [-0.5, 0.5]
    label = tf.cast(features['label'], tf.int32)  # cast the label to int32
    print(img, label)
    return img, label

images, labels = read_and_decode("/home/henson/Desktop/vgg/TFrecord.tfrecords")
print(images,labels)
images, labels = tf.train.shuffle_batch([images, labels], batch_size=20, capacity=16*20, min_after_dequeue=8*20)

I know I have already resized img to 224 * 224 and reshaped it to [224, 224, 3], but it doesn't work. How can I fix this?

Answer 1

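The problem is basically related to the shapes in the CNN architecture. Say I have defined the architecture shown in the picture. In the code, we define the weights and biases as follows. Looking at the weights, let's start with:

wc1: in this layer I define 32 filters of size 3x3 to be applied.

wc2: in this layer I define 64 filters of size 3x3 to be applied.

wc3: in this layer I define 128 filters of size 3x3 to be applied.

wd1: 38 * 38 * 128 is the interesting one (where does it come from?).

The architecture also uses max pooling; see the architecture picture for details. Let me explain, assuming your input image is 300 x 300 x 1 (it is 28x28x1 in the picture):

1. The input is 300 x 300 x 1.
2. With the stride set to 1, each filter sees the 300x300x1 image and produces one 300x300 feature map, so after applying the 32 3x3 filters we have 32 maps of 300x300, i.e. 300x300x32.
3. After max pooling (with stride 2, which is what you usually define), the size changes from 300 x 300 x 32 to 150 x 150 x 32.
4. With the stride set to 1, each filter now sees the 150x150x32 volume and produces one 150x150 map, so after applying the 64 3x3 filters the output is 150x150x64.
5. After max pooling (again with stride 2), the size changes from 150x150x64 to 75 x 75 x 64.
6. With the stride set to 1, each filter now sees the 75x75x64 volume, so after applying the 128 3x3 filters the output is 75x75x128.
7. Before the last max pooling, the size is 75x75, which is odd. If the padding is defined as "same", it is first padded to 76x76 (even), and with stride 2 the size then changes from 76x76x128 to 38 x 38 x 128.

That is where the 38 * 38 * 128 of 'wd1' in the code picture comes from.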
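The shape bookkeeping above can be checked with a few lines of plain Python. This is only a sketch under the assumptions stated in the answer: every convolution uses stride 1 with "same" padding (so it keeps the spatial size and sets the depth to the number of filters), and every max pooling uses stride 2 with "same" padding (so it produces ceil(size / 2)). The helper name `same_pool` is made up for this illustration.

```python
import math


def same_pool(size, stride=2):
    """Spatial size after max pooling with 'same' padding: ceil(size / stride)."""
    return math.ceil(size / stride)


size = 300  # assumed input of 300 x 300 x 1
shapes = []
for filters in (32, 64, 128):  # wc1, wc2, wc3
    # The stride-1 'same' convolution keeps the spatial size;
    # the depth becomes the number of filters in that layer.
    size = same_pool(size)  # the pooling step halves it, rounding up
    shapes.append((size, size, filters))

print(shapes)  # [(150, 150, 32), (75, 75, 64), (38, 38, 128)]

height, width, depth = shapes[-1]
flat = height * width * depth  # number of inputs expected by wd1
print(flat)  # 184832
```

If the real input size differs from the one the dense layer was sized for, this product changes and a reshape to the old size fails with a value-count mismatch.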

Answer 2

I had the same error, so I changed my code from this:

image = tf.decode_raw(image_raw, tf.float32)
image = tf.reshape(image, [img_width, img_height, 3])

to this:

image = tf.decode_raw(image_raw, tf.uint8)
image = tf.reshape(image, [img_width, img_height, 3])


# The type is now uint8 but we need it to be float.
image = tf.cast(image, tf.float32)

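This is because the data format in my generate_tf_record was somehow mismatched: I serialized the image to a string instead of a bytes list. I noticed a difference between your code and mine, which is that you convert the image to bytes. This is how I write the image to the tfrecord: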

            file_path, label = sample
            image = Image.open(file_path)
            image = image.resize((224, 224))
            image_raw = np.array(image).tostring()

            features = {
                'label': _int64_feature(class_map[label]),
                'text_label': _bytes_feature(bytes(label, encoding = 'utf-8')),
                'image': _bytes_feature(image_raw)
            }

            example = tf.train.Example(features=tf.train.Features(feature=features))
            writer.write(example.SerializeToString())

Hope it helps.
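As a rough check of the numbers in the error message, the sketch below uses only the Python standard library (no TensorFlow) to show how reading the same bytes with a 4-byte element type divides the value count by four. The zero-filled buffer is a stand-in for a real image, and `memoryview.cast` only imitates what `tf.decode_raw` does with a different `out_type`; treat it as an illustration under those assumptions, not a diagnosis of any particular file.

```python
# Stand-in for the raw bytes of a 224x224 RGB uint8 image: one byte per value.
height, width, channels = 224, 224, 3
raw = bytes(height * width * channels)

# Interpret the same buffer with two different element sizes.
as_uint8 = memoryview(raw).cast("B")    # 1 byte per element
as_float32 = memoryview(raw).cast("f")  # 4 bytes per element

print(len(as_uint8))    # 150528, matches a reshape to [224, 224, 3]
print(len(as_float32))  # 37632, too few values for [224, 224, 3]
```

Note that 37632 is also 112 * 112 * 3, so a record file written from smaller images would produce the identical message even with the correct dtype.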

Answer 3

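I had the same error as you and found the reason behind it. When the image is stored with .tostring(), the bytes are written in the dtype of the array, which in my case was tf.float32. The tfrecord is then decoded with decode_raw(tf.uint8), and that causes the mismatch error. I changed the code to: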

image=tf.decode_raw(image_raw,tf.float32)

or:

image=tf.image.decode_jpeg(image_raw,channels=3)

if image_raw was originally in JPEG format.
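One way to decide which of the two decoders applies is to look at the byte length of a stored record. The sketch below is a hypothetical helper (the name `matching_dtypes` and the list of candidate dtypes are assumptions made for this illustration, not part of TensorFlow): it reports which element sizes make a given byte count fit the expected shape exactly.

```python
def matching_dtypes(num_bytes, shape):
    """Return dtype names whose item size makes num_bytes fit shape exactly."""
    itemsizes = {"uint8": 1, "float16": 2, "float32": 4, "float64": 8}
    num_values = 1
    for dim in shape:
        num_values *= dim
    return [name for name, size in itemsizes.items()
            if num_values * size == num_bytes]


shape = (224, 224, 3)
print(matching_dtypes(150528, shape))  # ['uint8']
print(matching_dtypes(602112, shape))  # ['float32']
print(matching_dtypes(20000, shape))   # []
```

An empty result suggests the bytes are not a raw array of that shape at all, for example an encoded JPEG file or an image with a different size or channel count.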
