Time: 2018-04-30 16:11:45

Tags: python tensorflow machine-learning deep-learning pipeline



  • Windows 10
  • Intel Core i7-6820HQ CPU | 2.70 GHz, 8 CPUs
  • 16 GB RAM
  • 64-bit
  • NVIDIA Quadro M1000M

    • Approx. total memory: 10093 MB
    • Display memory (VRAM): 2019 MB
    • Shared memory: 8073 MB
  • TensorFlow 1.8

  • Python 3.5.2

  • Images (I have 36k images):

    • Train: 3000 x (720x1280x3)
    • Validation: 500 x (720x1280x3)
    • Test: 500 x (720x1280x3)

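My story & struggles
First of all, I want to say that I really enjoy machine learning, especially neural networks. But most of the time, when I work with TensorFlow, I feel like it is stabbing me in the back (for example, the speed of those releases... (1.8 :O)), and sometimes I don't even know whether I am doing things right or wrong (or whether I could do them better).


Because come on, it should be easy, for $*%€k's sake, no? In particular, shouldn't it be possible to cover 90% of all input pipelines with 1, 2 or 3 template pipelines? (I think it is more or less possible; a giant image of a cat is still just an image matrix.)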


  • Create a GAN network (GPU) (does not necessarily have to be part of this question)
  • Use the TF Estimator API (with a custom function)
  • Use TF records!
  • Use TF datasets!


In my first step, I create a TF record (train). As you can see, I loop over the images (from some folder) and write all the data to a single TFRecords file.

import tensorflow as tf

# Helper-function for wrapping an integer so it can be saved to the TFRecords file
def wrap_int64(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

# Helper-function for wrapping raw bytes so they can be saved to the TFRecords file.
def wrap_bytes(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

#from skimage.transform import rescale, resize, downscale_local_mean

def convert(image_paths, out_path):
    # Args:
    # image_paths   List of file-paths for the images.
    # out_path      File-path for the TFRecords output file.

    print("Converting: " + out_path)

    # Number of images. Used when printing the progress.
    num_images = len(image_paths)

    # Open a TFRecordWriter for the output-file.
    with tf.python_io.TFRecordWriter(out_path) as writer:

        # Iterate over all the image-paths.
        for i, path in enumerate(image_paths):

            # Print the percentage-progress.
            print_progress(count=i+1, total=num_images)

            # Load the image-file with the load_images helper (not shown in this post).
            img_bytes_sharp = load_images(path)

            # Convert the image to raw bytes.
            img_bytes_sharp = tf.compat.as_bytes(img_bytes_sharp.tostring())

            # Create a dict with the data we want to save in the
            # TFRecords file. You can add more relevant data here.
            data = {
                'x': wrap_bytes(img_bytes_sharp)
            }

            # Wrap the data as TensorFlow Features.
            feature = tf.train.Features(feature=data)

            # Wrap again as a TensorFlow Example.
            example = tf.train.Example(features=feature)

            # Serialize the data.
            serialized = example.SerializeToString()

            # Write the serialized data to the TFRecords file.
            writer.write(serialized)

  • Size: 6 GB
  • 3000 images
  • Preprocessing:
    • RGB values between 0 and 1
    • Type: float32
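As a rough sanity check on these numbers, the sketch below estimates how many bytes an uncompressed 720x1280x3 image takes as float32 versus uint8, and what 3000 of them add up to. It is plain arithmetic under the assumption that every record holds one full-resolution dense image; the helper name `image_nbytes` is made up for this illustration.

```python
def image_nbytes(height, width, channels, bytes_per_value):
    """Bytes needed for one uncompressed image stored as a dense array."""
    return height * width * channels * bytes_per_value


# Image shape and training-set size as listed in the question.
HEIGHT, WIDTH, CHANNELS = 720, 1280, 3
NUM_TRAIN_IMAGES = 3000

per_image_float32 = image_nbytes(HEIGHT, WIDTH, CHANNELS, 4)  # float32 = 4 bytes
per_image_uint8 = image_nbytes(HEIGHT, WIDTH, CHANNELS, 1)    # uint8 = 1 byte

total_float32 = per_image_float32 * NUM_TRAIN_IMAGES
total_uint8 = per_image_uint8 * NUM_TRAIN_IMAGES

print("float32: %.1f MB per image, %.1f GB in total"
      % (per_image_float32 / 1e6, total_float32 / 1e9))
print("uint8:   %.1f MB per image, %.1f GB in total"
      % (per_image_uint8 / 1e6, total_uint8 / 1e9))
```

Neither total matches the 6 GB reported here, so the stored images may not actually be full-resolution float32 arrays; that is only a guess from the arithmetic, not something the post confirms.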


def parse(serialized):
    # Define a dict with the data-names and types we expect to
    # find in the TFRecords file.
    # It is a bit awkward that this needs to be specified again,
    # because it could have been written in the header of the
    # TFRecords file instead.
    features = {
        'x': tf.FixedLenFeature([], tf.string)
    }

    # Parse the serialized data so we get a dict with our data.
    parsed_example = tf.parse_single_example(serialized=serialized,
                                             features=features)

    # Decode the raw bytes into a flat (1-D) float32 tensor.
    image_x = tf.decode_raw(parsed_example['x'], tf.float32)

    # The data was written as float32, so no cast is needed here.
    #image_x = tf.cast(image_x, tf.float32)

    return image_x

Step 2 + 1: Loading the TF record (actual)

def input_fn(filenames, train, batch_size=32, buffer_size=2048):
    # Args:
    # filenames:   Filenames for the TFRecords files.
    # train:       Boolean whether training (True) or testing (False).
    # batch_size:  Return batches of this size.
    # buffer_size: Read buffers of this size. The random shuffling
    #              is done on the buffer, so it must be big enough.

    # Create a TensorFlow Dataset-object which has functionality
    # for reading and shuffling data from TFRecords files.
    dataset = tf.data.TFRecordDataset(filenames=filenames)

    # Parse the serialized data in the TFRecords files.
    # This returns TensorFlow tensors for the image and labels.
    dataset = dataset.map(parse)

    if train:
        # If training then read a buffer of the given size and
        # randomly shuffle it.
        dataset = dataset.shuffle(buffer_size=buffer_size)

        # Allow infinite reading of the data.
        num_repeat = None
    else:
        # If testing then don't shuffle the data.

        # Only go through the data once.
        num_repeat = 1

    # Repeat the dataset the given number of times.
    dataset = dataset.repeat(num_repeat)

    # Get a batch of data with the given size.
    dataset = dataset.batch(batch_size)

    # Create an iterator for the dataset and the above modifications.
    iterator = dataset.make_one_shot_iterator()

    # Get the next batch of images and labels.
    images_batch = iterator.get_next()

    # The input-function must return a dict wrapping the images.
    x = {'image': images_batch}

    return x
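One thing worth checking in this function is the default `buffer_size=2048`: as far as I understand, `dataset.shuffle` fills a buffer of that many parsed elements before it emits anything, regardless of `batch_size`. The sketch below estimates how much memory that buffer would need, assuming each element is a full 720x1280x3 float32 image. The helper names and the 4 GiB budget are made up for this illustration.

```python
def shuffle_buffer_nbytes(buffer_size, element_nbytes):
    """Approximate memory held by a shuffle buffer of dense elements."""
    return buffer_size * element_nbytes


def max_buffer_size(budget_nbytes, element_nbytes):
    """Largest buffer_size whose elements fit in the given memory budget."""
    return budget_nbytes // element_nbytes


# Assumption: one element = one 720x1280x3 image stored as float32 (4 bytes).
ELEMENT_NBYTES = 720 * 1280 * 3 * 4
GIB = 2 ** 30

for buffer_size in (2048, 256, 32):
    needed = shuffle_buffer_nbytes(buffer_size, ELEMENT_NBYTES)
    print("buffer_size=%4d needs about %.2f GiB" % (buffer_size, needed / GIB))

# Hypothetical budget: let the shuffle buffer use at most 4 GiB of RAM.
print("largest buffer_size within 4 GiB:",
      max_buffer_size(4 * GIB, ELEMENT_NBYTES))
```

Under that assumption the default buffer alone would need roughly 21 GiB, more than the 16 GB of RAM in this machine, so passing a much smaller `buffer_size` to `input_fn` may be worth trying before changing anything else.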

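Although I think the setup above is pretty clear, as soon as I move away from the MNIST dataset (32x32 images), I run into memory problems. (I can't even use a batch size of 2.)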

For example: [image]

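  1. First, how do I deal with the memory problem? I could understand the problem if TF tries to store the whole 6-7 GB TFRecords file in its memory (graphics card memory). But I would have thought it was smarter than that... (Doesn't it work like a generator, keeping only x values + their positions in memory?)
  2. In the picture, you see Dataset.list_files at the beginning. My question about this: is that just 1 file, or does it mean that every image is its own tf.record? (Should I have created 3000 tf-records?) (Is that perhaps why I have memory problems?)

  3. The picture returns a dataset rather than an iterator (as in my code snippet). Any clue what they would do when using the tf-estimator API (is the iterator necessary)?

  4. That's basically it. The basic question is: how do I work with and use TensorFlow | tf-records | tf-estimator on BIG images (even bigger than 720p)?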
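On the generator idea in question 1: a streaming shuffle can be written as a generator that never holds more than `buffer_size` items at once. The pure-Python sketch below shows that pattern. It is an illustration of the general technique, not TensorFlow's actual implementation, and the function name `shuffle_buffer` is made up here.

```python
import random


def shuffle_buffer(items, buffer_size, seed=None):
    """Yield items in pseudo-random order, holding at most buffer_size of them.

    Illustration only: a generic streaming shuffle, not TensorFlow code.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    for item in buffer:
        yield item


# Usage: stream 10 "records" through a buffer of 4.
print(list(shuffle_buffer(range(10), buffer_size=4, seed=0)))
```

The point of the sketch is that memory use scales with `buffer_size` times the size of one item, not with the size of the whole file, so a streaming reader can still run out of memory if the buffer is large and the items are big.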

    https://www.youtube.com/watch?v=SxOsJPaxHME https://www.tensorflow.org/versions/master/performance/datasets_performance

  1. Download a pre-trained model.

  2. Extract it, and you will find the predefined pipeline.

