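I am working on a multi-label classification model in which I try to combine two models, a CNN and a text classifier, into a single model with Keras and train them together, like this: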
#cnn_model is a vgg16 model
#text_model looks as follows:
### takes the vectorized text as input
text_model = Sequential()
text_model.add(Dense(vec_size, input_shape=(vec_size,), name='aux_input'))
## merging both models
merged = Merge([cnn_model, text_model], mode='concat')
### final_model takes the combined models and adds a softmax classifier to it
final_model = Sequential()
final_model.add(merged)
final_model.add(Dense(n_classes, activation='softmax'))
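To make the shapes in the merge step concrete, here is a minimal NumPy sketch of what concatenating the two branch outputs along the feature axis does to one batch. The sizes (a batch of 4, 4096 CNN features, a text vector of length 6) are illustrative assumptions, not values taken from the model above.

```python
import numpy as np

# Illustrative sizes only (assumptions, not taken from the question).
batch_size, cnn_dim, vec_size = 4, 4096, 6

# Stand-ins for the outputs of cnn_model and text_model on one batch.
cnn_features = np.random.rand(batch_size, cnn_dim)
text_features = np.random.rand(batch_size, vec_size)

# Concatenation joins the two feature vectors of each sample side by side.
merged = np.concatenate([cnn_features, text_features], axis=-1)
print(merged.shape)  # (4, 4102)
```

The final Dense layer then sees one feature vector of length cnn_dim + vec_size per sample, which is why both inputs of a batch must refer to the same samples in the same order.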
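I am using an ImageDataGenerator to handle the images and their corresponding labels.
For the images I use a custom helper function that reads the images into the model from paths supplied by a pandas dataframe, one for training (df_train) and one for validation (df_validation). The dataframes also provide the final labels for the model in the "label_vec" column: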
# From https://github.com/keras-team/keras/issues/5152
import os
import numpy

def flow_from_dataframe(img_data_gen, in_df, path_col, y_col, **dflow_args):
    # Build a directory iterator, then overwrite its file list and labels
    # with the values from the dataframe.
    base_dir = os.path.dirname(in_df[path_col].values[0])
    print('## Ignore next message from keras, values are replaced anyways')
    df_gen = img_data_gen.flow_from_directory(base_dir, class_mode = 'sparse', **dflow_args)
    df_gen.filenames = in_df[path_col].values
    df_gen.classes = numpy.stack(in_df[y_col].values)
    df_gen.samples = in_df.shape[0]
    df_gen.n = in_df.shape[0]
    df_gen._set_index_array()
    df_gen.directory = '' # since we have the full path
    print('Reinserting dataframe: {} images'.format(in_df.shape[0]))
    return df_gen
import keras
from keras.applications.vgg16 import preprocess_input
train_datagen = keras.preprocessing.image.ImageDataGenerator(preprocessing_function=preprocess_input, horizontal_flip=True)
validation_datagen = keras.preprocessing.image.ImageDataGenerator(preprocessing_function=preprocess_input)  # rescale=1./255)
train_generator = flow_from_dataframe(train_datagen, df_train,
                                      path_col = 'filename',
                                      y_col = 'label_vec',
                                      target_size=(224, 224), batch_size=128, shuffle=False)
validation_generator = flow_from_dataframe(validation_datagen, df_validation,
                                           path_col = 'filename',
                                           y_col = 'label_vec',
                                           target_size=(224, 224), batch_size=64, shuffle=False)
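Now I am trying to feed the model my one-hot encoded text vectors (e.g. [0,0,0,1,0,0]), which are also stored in a pandas dataframe.
Because my train_generator already supplies the image and label data, I am looking for a way to combine it with a second generator so that I can additionally feed in the corresponding text vectors.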
Answer 0 (score: 2)
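You may want to consider writing your own generator (using Keras's Sequence object, which supports multiprocessing) instead of modifying the ImageDataGenerator code. From the Keras documentation: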
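from skimage.io import imread
from skimage.transform import resize
import numpy as np
from keras.utils import Sequence

# x_set is a list of image file paths, y_set holds the associated labels
class CIFAR10Sequence(Sequence):
    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller
        return int(np.ceil(len(self.x) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        return np.array([
            resize(imread(file_name), (200, 200))
            for file_name in batch_x]), np.array(batch_y)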
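The batching arithmetic in __len__ and __getitem__ can be checked in isolation. The sketch below uses a hypothetical helper, batch_bounds (not part of Keras), to list the slice bounds each batch index maps to; it shows that the last batch is simply shorter when the number of samples is not a multiple of the batch size.

```python
import math

def batch_bounds(n_samples, batch_size):
    """Return the (start, stop) slice bounds of every batch, in order."""
    n_batches = math.ceil(n_samples / batch_size)
    return [(i * batch_size, min((i + 1) * batch_size, n_samples))
            for i in range(n_batches)]

bounds = batch_bounds(n_samples=10, batch_size=4)
print(bounds)  # [(0, 4), (4, 8), (8, 10)]
```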
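You can keep the labels, the image paths, and the paths to the text files in a single pandas dataframe and modify the __getitem__ method above so that your generator yields all three together: X, containing all the inputs (the image batch and the text-vector batch), and a NumPy array Y containing the outputs.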
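A minimal sketch of such a generator follows, under several assumptions: the class name ImageTextSequence and the load_image callable are hypothetical, the text vectors are assumed to be already available as arrays (as in the question) rather than read from files, and the import falls back to a plain object base class when Keras is not installed so that the batching logic can still be run on its own.

```python
import numpy as np

try:
    # Assumption: TensorFlow's bundled Keras is available.
    from tensorflow.keras.utils import Sequence
except ImportError:
    # Fallback so the batching logic can run without Keras installed.
    Sequence = object


class ImageTextSequence(Sequence):
    """Yields ([image_batch, text_batch], label_batch) for a two-input model."""

    def __init__(self, paths, text_vecs, labels, batch_size, load_image):
        super().__init__()
        self.paths = list(paths)
        self.text_vecs = np.asarray(text_vecs, dtype="float32")
        self.labels = np.asarray(labels, dtype="float32")
        self.batch_size = batch_size
        self.load_image = load_image  # hypothetical callable: path -> HxWx3 array

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return int(np.ceil(len(self.paths) / float(self.batch_size)))

    def __getitem__(self, idx):
        rows = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        # The same row slice is used for all three arrays, so images,
        # text vectors and labels stay aligned sample by sample.
        images = np.stack([self.load_image(p) for p in self.paths[rows]])
        return [images, self.text_vecs[rows]], self.labels[rows]


def fake_load_image(path):
    # Stand-in loader for the demo; a real one would read and resize the file.
    return np.zeros((224, 224, 3), dtype="float32")


paths = ["img_{}.jpg".format(i) for i in range(5)]      # made-up file names
text_vecs = np.eye(6, dtype="float32")[[3, 0, 1, 5, 2]]  # one-hot text vectors
labels = np.eye(4, dtype="float32")[[0, 1, 2, 3, 0]]     # made-up label vectors

seq = ImageTextSequence(paths, text_vecs, labels, batch_size=2,
                        load_image=fake_load_image)
(images, texts), y = seq[0]
print(len(seq), images.shape, texts.shape, y.shape)
```

With the dataframes from the question, the constructor arguments would presumably come from columns such as df_train['filename'] and numpy.stack(df_train['label_vec'].values), plus whichever column holds the text vectors. The model consuming these batches needs two inputs, in the same order as the list returned by __getitem__.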