我一直在尝试训练一个由VGG16网络和一些LSTM层组成的网络。由于我的图像很大,并且因为VGG16不能随图像大小缩放,所以我决定将图像分成小块,并训练LSTM逐段读取图像。由于我的数据集很大,因此我需要模型来批量加载数据。我已经尝试构建一个自定义数据生成器,但是我不确定我的实现是否正确,因为我不知道如何查看模型在训练过程中加载了哪些图像。如果每个时期都有不同数量的音色,我也不知道如何使之适应。
我的数据集组织如下:
data|
|---train|
|---class1|
|---image1|
|---im1_patch1.tif
|---im1_patch2.tif
...
|---im1_patch352.tif
|---image2|
|---im2_patch1.tif
|---im2_patch2.tif
...
|---im2_patch352.tif
|---class2|
|---image3|
|---im3_patch1.tif
|---im3_patch2.tif
...
|---im3_patch352.tif
|---image4|
|---im4_patch1.tif
|---im4_patch2.tif
...
|---im4_patch352.tif
如您所见,我的图像已经被分解成小块,我想分批加载它们,以使我的每张张量X都具有以下尺寸:[batch_size,n_patches,w,h,n_channels]。 batch_size是每个时期的图像数量,n_patches是每个图像的补丁数量,w和h是每个补丁的尺寸(固定),n_channels是每个补丁的通道数(固定为3)
我首先有一些问题,
我的网络体系结构如下:
vgg = VGG16(
include_top=False,
weights='imagenet',
input_shape=(224, 224, 3)
)
for layer in vgg.layers:
layer.trainable = False
model = Sequential()
model.add(TimeDistributed(vgg, input_shape=(npatches, w, h, nchannels)))
model.add(TimeDistributed(Flatten(name="flatten")))
model.add(TimeDistributed(Dense(4096, activation="relu")))
model.add(TimeDistributed(Dropout(0.5)))
model.add(TimeDistributed(Dense(4096, activation="relu")))
model.add(TimeDistributed(Dropout(0.5)))
model.add(TimeDistributed(Flatten(name="flatten")))
model.add(LSTM(264, activation='tanh',return_sequences=True))
model.add(LSTM(128, activation='tanh',return_sequences=True))
model.add(LSTM(64, activation='tanh',return_sequences=False))
model.add(Dense(64, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1, activation='sigmoid'))
model.summary()
最后,我尝试构建自定义图像生成器:
def LoadImages(folder, batch_size, npatches):
batch_start = 0
batch_end = n_patches-1
epoch_id = 0
files = list(paths.list_images(folder))
n_images = len(files)
StopCriteria = n_images/(batch_size*n_patches)
while True:
while epoch_id < StopCriteria:
patch_id = 0
X = np.empty([batch_size, n_patches, 224, 224, 3])
Y = np.empty([batch_size, n_patches])
for image_path in files[batch_start:batch_end]:
img = tf.keras.preprocessing.image.load_img(image_path,color_mode="rgb")
input_arr = keras.preprocessing.image.img_to_array(img)
X[epoch_id,patch_id,:,:,:] = input_arr
if image_path.split("\\")[-3].split('/')[-1] == 'long':
Y[epoch_id,patch_id] = 0
if image_path.split("\\")[-3].split('/')[-1] == 'short':
Y[epoch_id,patch_id] = 1
patch_id += 1
yield (X,Y)
batch_start += batch_size + n_patches-1
batch_end += batch_size + n_patches-1
epoch_id += 1
谢谢!