Question

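I have a set of 3 x 32 x 32 images that I converted into a single TFRecord file.

When I read the data back with the Dataset API and inspect the shape of the resulting op, I get: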

'image1': <tf.Tensor 'IteratorGetNext_2:0' shape=(?, ?, ?, ?) dtype=float32>

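By contrast, I have another set of 1 x 32 x 32 images that I load into memory and read through a dataset. In that case the dataset is able to determine the image dimensions: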

'image2': <tf.Tensor 'IteratorGetNext_2:2' shape=(?, 1, 32, 32) dtype=float32>

This is a problem because I cannot run a convolution on image1: its channel dimension is None. I get this error:

ValueError: The channel dimension of the inputs should be defined. Found `None`.

Is this a bug, or did I make a mistake when encoding or decoding the images?

Here is the code I use for encoding and decoding:

import os

import tensorflow as tf


class img_to_tf_record_writer:
    # based on code from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/how_tos
    # /reading_data/convert_to_records.py
    def __init__(self, images, labels, save_path):
        """

        :param images: A numpy array of images (number of images, channels, height, width) to convert to a tfrecord
        :param labels: A numpy array of labels to convert to a tfrecord
        :param save_path: A string representing the full path to save the tfrecord
        """
        self.images = images
        self.labels = labels
        self.filename = save_path

    def _int64_feature(self, value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

    def _bytes_feature(self, value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def encode(self):
        if os.path.isfile(self.filename):
            print(f'{self.filename} exists')
            return

        num_examples, depth, rows, cols = self.images.shape

        print(f'Converting to TF Record format')

        with tf.python_io.TFRecordWriter(self.filename) as writer:
            for i in range(num_examples):
                # tobytes() replaces the deprecated ndarray.tostring()
                image_raw = self.images[i].tobytes()
                label = int(self.labels[i])

                feature_dict = {
                    'height': self._int64_feature(rows),
                    'width': self._int64_feature(cols),
                    'depth': self._int64_feature(depth),
                    'label': self._int64_feature(label),
                    'image_raw': self._bytes_feature(image_raw)
                }
                example = tf.train.Example(
                    features=tf.train.Features(feature=feature_dict)
                )
                writer.write(example.SerializeToString())

        print(f'Finished converting to TF Record format')

    @staticmethod
    def decode(serialized_example):
        # Based on code from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/how_tos/
        # reading_data/fully_connected_reader.py
        features = tf.parse_single_example(
            serialized_example,
            features={
                'image_raw': tf.FixedLenFeature([], tf.string),
                'label': tf.FixedLenFeature([], tf.int64),
                'height': tf.FixedLenFeature([], tf.int64),
                'width': tf.FixedLenFeature([], tf.int64),
                'depth': tf.FixedLenFeature([], tf.int64)
            }
        )

        image_shape = tf.stack([features['depth'], features['height'], features['width']])
        image = tf.decode_raw(features['image_raw'], tf.float32)
        image = tf.reshape(image, image_shape)

        label = tf.cast(features['label'], tf.int32)

        return image, label

Answer 1

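You read width, height, and depth from the TFRecord itself, so the image shape is dynamic. It is known only after a record has been read. When the graph is being built there is no way to know it, which I assume is why you see that error.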
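
To illustrate the difference, here is a minimal sketch (not your code) that builds a small graph and compares the static shape of a reshape driven by a tensor with one driven by Python constants. The function name `compare_static_shapes` and the two placeholders are made up for the illustration; it assumes a TensorFlow version that provides `tf.compat.v1`.

```python
import tensorflow as tf


def compare_static_shapes():
    """Return the static shapes of a tensor-driven and a constant-driven reshape."""
    graph = tf.Graph()
    with graph.as_default():
        # Stand-in for the output of decode_raw: a flat float vector.
        flat = tf.compat.v1.placeholder(tf.float32, shape=[None])
        # Stand-in for tf.stack([depth, height, width]): values unknown until run time.
        dims = tf.compat.v1.placeholder(tf.int64, shape=[3])

        # Only the rank can be inferred here, not the individual dimensions.
        dynamic = tf.reshape(flat, dims)
        # With Python constants, every dimension is known at graph-construction time.
        fixed = tf.reshape(flat, [3, 32, 32])
    return dynamic.shape.as_list(), fixed.shape.as_list()


dynamic_shape, fixed_shape = compare_static_shapes()
print('tensor-driven reshape:', dynamic_shape)
print('constant-driven reshape:', fixed_shape)
```

The tensor-driven reshape should report only unknown dimensions, which matches the `(?, ?, ?, ?)` you see once the batch dimension is added.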

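In TensorFlow you normally work with a static graph, and dynamic input shapes like this are usually not practical, because the number of weights would change.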

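I don't know why you need a dynamic image shape. Normally the image shape stored in the TFRecords matches what the model expects. The best option is probably to resize the images before feeding them to the model. You may also have to settle on the number of channels to use.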
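
If every record really holds a 3 x 32 x 32 float32 image, as in the question, one possible way to give the model a fixed shape is to reshape to constants inside the decode function instead of reading the dimensions from the record. This is only a sketch under that assumption: `IMAGE_SHAPE` and `decode_fixed` are names I made up, and it uses the `tf.io` aliases available in newer TensorFlow releases.

```python
import tensorflow as tf

# Assumed fixed layout (channels, height, width), taken from the question.
IMAGE_SHAPE = [3, 32, 32]


def decode_fixed(serialized_example):
    """Parse one serialized example and return an image with a static shape."""
    features = tf.io.parse_single_example(
        serialized_example,
        features={
            'image_raw': tf.io.FixedLenFeature([], tf.string),
            'label': tf.io.FixedLenFeature([], tf.int64),
        },
    )
    image = tf.io.decode_raw(features['image_raw'], tf.float32)
    # Constant shape, so the channel dimension is defined when the graph is built.
    image = tf.reshape(image, IMAGE_SHAPE)
    label = tf.cast(features['label'], tf.int32)
    return image, label
```

You would pass this to `dataset.map(decode_fixed)` in place of the original `decode`. The stored height, width, and depth features are simply ignored here; if your images could differ in size, you would need to resize them to a common shape first.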
