How to import a custom image dataset for transfer learning

Time: 2020-06-27 01:36:58

Tags: python tensorflow keras cnn

I have a custom dataset laid out as follows:

├── data
│   └── train
│       ├── 1
│       │   ├── x.jpg
│       │   ├── ...
│       ├── 2
│       │   ├── 1.jpg
│       │   ├── 2.jpg
│       │   ├── 3.jpg
│       │   ├── ...
│       ├── 3
│       │   ├── ...
...     ...     ...

I would like to know how to import it as a tf.data.Dataset in the correct format, so that I can run this code:

for image_batch, label_batch in train_batches.take(1):
   pass

image_batch.shape

The batches will be used as input to a pretrained MobileNet model. Thanks.

1 answer:

Answer 0 (score: 0)

The code below should do it.

import os
import tensorflow as tf

# Assumption: each subdirectory of /data/train is named after an integer class label
train_dir = '/data/train/'

# Find the labels (keep only directories, so stray files are skipped)
labels = sorted(d for d in os.listdir(train_dir) if os.path.isdir(os.path.join(train_dir, d)))

# Pair each image path with its corresponding label
label_pairs = [
    (os.path.join(train_dir, label, path), int(label))
    for label in labels
    for path in sorted(os.listdir(os.path.join(train_dir, label)))
]

# Now separate the paths and labels.
# Building the pairs first guarantees that paths and labels stay aligned.
train_img_paths = [pair[0] for pair in label_pairs]
train_img_labels = [pair[1] for pair in label_pairs]

# Convert the paths and labels to a tf.data.Dataset.
# Each element is a (path string, integer label) pair; the image files still
# have to be read and decoded (e.g. tf.io.read_file, tf.io.decode_jpeg) and
# the dataset batched before it can be fed to a model.
dataset = tf.data.Dataset.from_tensor_slices((train_img_paths, train_img_labels))
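
A possible continuation, which is not part of the original answer: the dataset above holds path strings, while the loop in the question expects batches of image tensors. The sketch below shows one way to get there. The names IMG_SIZE, BATCH_SIZE, load_image and make_train_batches are made up for illustration, and the code assumes that all files are JPEGs, that TensorFlow 2.4 or later is installed (for tf.data.AUTOTUNE), and that the model expects pixels scaled to [-1, 1], as the Keras MobileNetV2 preprocessing does.

```python
import tensorflow as tf

IMG_SIZE = 160   # assumed input size; pick whatever the chosen MobileNet expects
BATCH_SIZE = 32  # assumed batch size


def load_image(path, label):
    """Read one JPEG file, resize it, and scale its pixels to [-1, 1]."""
    image = tf.io.read_file(path)
    image = tf.io.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))  # returns float32
    image = image / 127.5 - 1.0
    return image, label


def make_train_batches(img_paths, img_labels):
    """Turn parallel lists of paths and labels into shuffled image batches."""
    dataset = tf.data.Dataset.from_tensor_slices((img_paths, img_labels))
    # Shuffle the cheap path strings before decoding any image data.
    dataset = dataset.shuffle(buffer_size=len(img_paths))
    dataset = dataset.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
```

With the lists built in the answer, train_batches = make_train_batches(train_img_paths, train_img_labels) should yield an image_batch of shape (batch, 160, 160, 3) in the loop from the question. Note that folders named 1, 2, 3 produce labels that start at 1; a loss such as sparse categorical crossentropy expects labels from 0 to n-1, so subtracting 1 may be needed.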