Alternative way to process patches generated from an image dataset with TensorFlow

Time: 2018-11-02 09:29:17

Tags: python tensorflow deep-learning

Step 1: I have a tf-record containing 4644 large images. I have to decode each image in the tf-record and convert it into patches of size 50x50x3.

Step 2: Once I have created the patches for all 4644 images, I have to write them back to another tf-record, which I will later use to train a network.

I don't know how to do steps 1 and 2 efficiently. I have written code for both steps, but since the patches generated from the 4644 images have a total shape of [369969, 50, 50, 3], my system hangs.

Here is the code I have written.

For decoding each image and converting it into patches of size 50x50x3:

dataset = (tf.data.TFRecordDataset('Image.tfrecords')
           .map(read_and_decode, num_parallel_calls=num_parallel_calls)   # decode each serialized example into an image
           .map(get_patches_fn, num_parallel_calls=num_parallel_calls)    # split each image into 50x50x3 patches
           .apply(tf.data.experimental.unbatch())                         # unbatch the patches we just produced
           .prefetch(1)                                                   # always keep one element ready to serve
           )

iterator = dataset.make_one_shot_iterator()
patches_image = iterator.get_next()

init=tf.global_variables_initializer()

temp_patches_image=[]
image_batch_size=369969
with tf.Session() as sess:
    sess.run(init)
    for i in range(0, image_batch_size):
        res = sess.run(patches_image)
        temp_patches_image.append(res)
    temp_patches_image = tf.stack(temp_patches_image)  # -------> This step causes my system to hang
    final_patches_image = sess.run(temp_patches_image)
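
For reference, read_and_decode and get_patches_fn are roughly as follows (simplified sketches; the actual feature keys, dtypes and decode logic in my code may differ):

def read_and_decode(serialized_example):
    # Parse one serialized example and decode it back into an image tensor.
    # NOTE: simplified sketch -- the real feature keys/dtypes may differ.
    features = tf.parse_single_example(
        serialized_example,
        features={
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'image_raw': tf.FixedLenFeature([], tf.string),
        })
    image = tf.decode_raw(features['image_raw'], tf.uint8)
    height = tf.cast(features['height'], tf.int32)
    width = tf.cast(features['width'], tf.int32)
    return tf.reshape(image, [height, width, 3])

def get_patches_fn(image):
    # Split one image into non-overlapping 50x50x3 patches.
    patch_size = 50
    patches = tf.extract_image_patches(
        tf.expand_dims(image, 0),                 # add a batch dimension
        ksizes=[1, patch_size, patch_size, 1],
        strides=[1, patch_size, patch_size, 1],   # non-overlapping windows
        rates=[1, 1, 1, 1],
        padding='VALID')
    return tf.reshape(patches, [-1, patch_size, patch_size, 3])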


Step 2:
For writing the created patches to another tf-record:

create_data_record_patch(final_patches_image, 'patches.tfrecord')


def create_data_record_patch(patches, tfrecords_filename):
    # Function to write all patches to a tf-record
    #   Input: patches - array of patches of shape [num_patches, height, width, channels]
    #          tfrecords_filename - name of the tf-record file to write

    writer = tf.python_io.TFRecordWriter(tfrecords_filename)
    total_batch_size = 0
    for i in range(patches.shape[0]):
        height = patches.shape[1]
        width = patches.shape[2]

        img_raw = patches[i, :, :, :].tostring()
        example = tf.train.Example(features=tf.train.Features(feature={
            'height': _int64_feature(height),
            'width': _int64_feature(width),
            'image_raw': _bytes_feature(img_raw)}))

        writer.write(example.SerializeToString())
        total_batch_size += 1

    writer.close()
    return total_batch_size
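
where _int64_feature and _bytes_feature are the usual TFRecord feature helpers:

def _int64_feature(value):
    # Wrap a Python int in a tf.train.Feature.
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    # Wrap a bytes string in a tf.train.Feature.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))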

Additional information: the code above works when the tf-record contains 50 images (which produced roughly 5000 patches) instead of 4644 images. I know tf.stack is causing the problem, since it has to handle all 369969 patches at once, but I don't know of another way to create a single tf-record containing all of the generated patches.
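
The only alternative I can think of is to drop tf.stack entirely and write each patch straight to the new tf-record as it comes off the iterator, roughly like the sketch below (untested at full scale), but I am not sure whether this is the right/efficient way to do it:

writer = tf.python_io.TFRecordWriter('patches.tfrecord')
with tf.Session() as sess:
    try:
        while True:
            patch = sess.run(patches_image)        # one 50x50x3 patch as a numpy array
            example = tf.train.Example(features=tf.train.Features(feature={
                'height': _int64_feature(patch.shape[0]),
                'width': _int64_feature(patch.shape[1]),
                'image_raw': _bytes_feature(patch.tostring())}))
            writer.write(example.SerializeToString())
    except tf.errors.OutOfRangeError:
        pass                                       # iterator exhausted: all patches written
writer.close()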

Any help is appreciated.

Thanks.

0 Answers:

There are no answers yet.