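I have a TFRecord file of about 8 GB. I want to split it into 4 files of about 2 GB each. How can I do this directly? Can it be done in TensorFlow? Is there any application for splitting TFRecord data?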
Answer 0 (score: 0)
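I don't know how to specify the resulting size of a TFRecord file. However, you can certainly limit the number of examples (records) written to each TFRecord file. I know this isn't exactly what you asked for, but it gets the same job done.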
(fragment_size is the number of examples written to one TFRecord file)
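import os
import tensorflow as tf

# Assumed to be defined elsewhere: data, num_videos, num_images, height, width,
# num_channels, color_depth, destination_path, name, fragment_size,
# total_batch_number, and the helpers _bytes_feature / _int64_feature.
writer = None
current_batch_number = 0  # assumed 1-based file counter; not shown in the original snippet

for video_count in range(num_videos):
    # Start a new output file every fragment_size examples.
    if video_count % fragment_size == 0:
        if writer is not None:
            writer.close()
        current_batch_number += 1
        filename = os.path.join(
            destination_path,
            name + str(current_batch_number) + '_of_' + str(total_batch_number) + '.tfrecords')
        print('Writing', filename)
        # TF 1.x API; in TF 2.x the same class is tf.io.TFRecordWriter.
        writer = tf.python_io.TFRecordWriter(filename)

    # Build one Example per video: every frame is stored as raw bytes.
    feature = {}
    for image_count in range(num_images):
        path = 'blob' + '/' + str(image_count)
        image = data[video_count, image_count, :, :, :]
        image = image.astype(color_depth)
        image_raw = image.tobytes()  # tostring() is the deprecated alias of tobytes()
        feature[path] = _bytes_feature(image_raw)
    feature['height'] = _int64_feature(height)
    feature['width'] = _int64_feature(width)
    feature['depth'] = _int64_feature(num_channels)

    example = tf.train.Example(features=tf.train.Features(feature=feature))
    writer.write(example.SerializeToString())

# Close the last file.
if writer is not None:
    writer.close()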
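If the original data is no longer at hand and only the 8 GB file exists, the records can also be copied into smaller files without decoding them. The following is a minimal sketch, not something from the answer above. It assumes the file is uncompressed and relies on the documented TFRecord framing (a little-endian uint64 length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload). The function name split_tfrecord and its parameters are made up for illustration, and the CRCs are copied as-is rather than verified.

```python
import struct


def split_tfrecord(src_path, dst_pattern, max_bytes):
    """Copy whole records from src_path into shards of at most ~max_bytes.

    dst_pattern is a format string with one integer field,
    e.g. 'part_{:02d}.tfrecords'. Returns the list of shard paths written.
    A single record larger than max_bytes still gets a shard of its own.
    """
    shard_paths = []
    out = None
    written = 0
    try:
        with open(src_path, 'rb') as src:
            while True:
                # Header: uint64 payload length + uint32 CRC of that length.
                header = src.read(12)
                if not header:
                    break  # clean end of file
                if len(header) != 12:
                    raise ValueError('truncated record header')
                (length,) = struct.unpack('<Q', header[:8])
                # Body: payload + uint32 CRC of the payload.
                body = src.read(length + 4)
                if len(body) != length + 4:
                    raise ValueError('truncated record body')
                record_size = len(header) + len(body)
                # Open a new shard when the next record would exceed the limit.
                if out is None or (written > 0 and written + record_size > max_bytes):
                    if out is not None:
                        out.close()
                    path = dst_pattern.format(len(shard_paths))
                    shard_paths.append(path)
                    out = open(path, 'wb')
                    written = 0
                out.write(header)
                out.write(body)
                written += record_size
    finally:
        if out is not None:
            out.close()
    return shard_paths
```

For the case in the question, a call such as split_tfrecord('data.tfrecords', 'data_part_{:02d}.tfrecords', 2 * 1024**3) should produce roughly four files of about 2 GB each (the file names here are placeholders). Because records are never cut in half, each output file is itself a valid TFRecord file, and the shard sizes only approximate the limit.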