Splitting a TensorFlow dataset using tf.data

Date: 2019-02-18 13:47:57

Tags: tensorflow split dataset

I want to split an image dataset into train, test, and validation sets. I am using the tf.data API, but I don't know how to use the resulting splits with tf.data.

from random import shuffle
import glob
import os

shuffle_data = True  # shuffle the addresses before splitting
data_root = 'I:/pattern/Data2/train'

# Read image addresses from the 'train' folder.
# Assumption: one subfolder per class, as in the TensorFlow image-loading tutorial.
addrs = sorted(glob.glob(os.path.join(data_root, '*', '*')))

# Assumption: the label of an image is the index of its parent folder name.
label_names = sorted({os.path.basename(os.path.dirname(a)) for a in addrs})
label_to_index = {name: i for i, name in enumerate(label_names)}
labels = [label_to_index[os.path.basename(os.path.dirname(a))] for a in addrs]

# Shuffle addresses and labels together so the pairs stay aligned
if shuffle_data:
    c = list(zip(addrs, labels))
    shuffle(c)
    addrs, labels = zip(*c)

# Divide the data into 60% train, 20% validation, and 20% test
train_end = int(0.6 * len(addrs))
val_end = int(0.8 * len(addrs))
train_addrs, train_labels = addrs[:train_end], labels[:train_end]
val_addrs, val_labels = addrs[train_end:val_end], labels[train_end:val_end]
test_addrs, test_labels = addrs[val_end:], labels[val_end:]
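
The same 60/20/20 split could also be done on a tf.data.Dataset itself, using take and skip. The following is only a minimal sketch under assumptions: `split_dataset`, its parameters, and the fixed `seed` are hypothetical names introduced here, and the number of elements `n` has to be known in advance (for example `len(addrs)`).

```python
import tensorflow as tf


def split_dataset(ds, n, train_frac=0.6, val_frac=0.2, seed=42):
    """Split a dataset of n elements into train/validation/test parts."""
    # Shuffle once with a fixed order; without reshuffle_each_iteration=False
    # the order would change between iterations and the splits could overlap.
    ds = ds.shuffle(buffer_size=n, seed=seed, reshuffle_each_iteration=False)
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    train_ds = ds.take(n_train)               # first n_train elements
    val_ds = ds.skip(n_train).take(n_val)     # next n_val elements
    test_ds = ds.skip(n_train + n_val)        # everything that remains
    return train_ds, val_ds, test_ds
```

Applying this to a dataset of (path, label) pairs before any image decoding keeps the shuffle buffer small, since it only holds strings and integers.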

I used the code from this link to build the tf.data dataset: https://www.tensorflow.org/tutorials/load_data/images
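
One possible way to use already-split lists of addresses and labels with tf.data is to build a separate dataset per split with `tf.data.Dataset.from_tensor_slices`. This is a sketch, not a definitive implementation: `make_dataset`, `load_and_preprocess`, and `IMG_SIZE` are hypothetical names, the images are assumed to be JPEG files, and `tf.data.AUTOTUNE` assumes TensorFlow 2.4 or later.

```python
import tensorflow as tf

IMG_SIZE = 192  # hypothetical target size; adjust to the model's input


def load_and_preprocess(path, label):
    """Read one image file, decode it, resize it, and scale it to [0, 1]."""
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image, channels=3)  # assumes JPEG input
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
    return image / 255.0, label


def make_dataset(addrs, labels, batch_size=32, training=False):
    """Build a batched pipeline from parallel sequences of paths and labels."""
    ds = tf.data.Dataset.from_tensor_slices((list(addrs), list(labels)))
    if training:
        # Shuffle the cheap (path, label) pairs before loading any images
        ds = ds.shuffle(buffer_size=max(len(addrs), 1))
    ds = ds.map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

For example, `val_ds = make_dataset(val_addrs, val_labels)` would give the validation pipeline, and the training lists would be passed with `training=True`. With a Keras model, the resulting datasets could then be handed to `model.fit(train_ds, validation_data=val_ds)`.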

0 answers:

No answers yet.