I have a custom dataset laid out like this:
├── data
│   └── train
│       ├── 1
│       │   ├── x.jpg
│       │   ├── ...
│       ├── 2
│       │   ├── 1.jpg
│       │   ├── 2.jpg
│       │   ├── 3.jpg
│       │   ├── ...
│       ├── 3
│       │   ├── ...
... ... ...
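I would like to know how to load it as a correctly formatted tf.data.Dataset, so that I can run the following code
for image_batch, label_batch in train_batches.take(1):
    pass
image_batch.shape
and feed the batches to a pretrained MobileNet model. Thanks.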
Answer 0 (score: 0)
The code below should do the trick.
import os
import tensorflow as tf
# Here, I am assuming that the directories in /data/train are the class labels
train_dir = '/data/train/'
# Find the labels (sorted for a reproducible order; stray files are skipped)
labels = sorted(d for d in os.listdir(train_dir) if os.path.isdir(os.path.join(train_dir, d)))
# Pair up each image path with its corresponding label
label_pairs = [
    (os.path.join(train_dir, label, fname), int(label))
    for label in labels
    for fname in sorted(os.listdir(os.path.join(train_dir, label)))
]
# Now, separate the paths and labels.
# Building the pairs first keeps every path aligned with its label.
train_img_paths = [path for path, _ in label_pairs]
train_img_labels = [label for _, label in label_pairs]
# Convert the paths and labels to a tf.data.Dataset.
# Note: this dataset yields (path string, label) pairs; the images are not decoded yet.
dataset = tf.data.Dataset.from_tensor_slices((train_img_paths, train_img_labels))
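The dataset above still holds file paths rather than pixels, so the loop in the question would not yet produce image batches. Below is a minimal sketch of one way to finish the pipeline. The names `load_image`, `make_batches`, `IMG_SIZE`, and `BATCH_SIZE` are hypothetical, introduced only for illustration. It assumes the files are JPEGs, a 224x224 input (MobileNet's default size), and the [-1, 1] pixel scaling used by MobileNet V1/V2; adjust these to match the model you actually load.

```python
import tensorflow as tf

IMG_SIZE = 224   # assumption: MobileNet's default input resolution
BATCH_SIZE = 32  # assumption: pick whatever fits in memory


def load_image(path, label):
    """Read one JPEG from disk, resize it, and scale pixels to [-1, 1]."""
    raw = tf.io.read_file(path)
    image = tf.io.decode_jpeg(raw, channels=3)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])  # result is float32
    image = image / 127.5 - 1.0  # assumed MobileNet V1/V2-style scaling
    return image, label


def make_batches(paths, labels, batch_size=BATCH_SIZE):
    """Build a shuffled, batched dataset of (image, label) pairs."""
    ds = tf.data.Dataset.from_tensor_slices((paths, labels))
    # Shuffle the cheap (path, label) pairs before decoding any images.
    ds = ds.shuffle(buffer_size=max(len(paths), 1))
    ds = ds.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

With the lists from the answer, `train_batches = make_batches(train_img_paths, train_img_labels)` should let the loop in the question run, with `image_batch.shape` coming out as `(batch, 224, 224, 3)`. Note that the folder names here give labels starting at 1; losses such as sparse categorical cross-entropy expect class indices starting at 0, so you may need to subtract 1. Recent TensorFlow versions also offer `tf.keras.utils.image_dataset_from_directory` as a shortcut, which assigns label indices from the sorted folder names rather than from `int(label)`.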