Question

To classify images, we use a neural network made up of several convolutional layers followed by several fully connected layers.

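The metadata contains some numerical information that could help classify the images. Is there a simple way to feed the numerical metadata, together with the output of the convolutions, into the first fully connected layer? Can this be done with TensorFlow or, even better, Keras?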

Answer 1

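You can process the numerical data in a separate branch, merge its result with the CNN branch, and then pass the merged tensor to a few final dense layers. Here is a general sketch of that solution: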

from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model

# process image data using conv layers
inp_img = Input(shape=...)
# ... (conv layers ending in out_conv, the flattened output of this branch)

# process numerical data
inp_num = Input(shape=...)
x = Dense(...)(inp_num)
out_num = Dense(...)(x)

# merge the result with a merge layer such as concatenation
merged = concatenate([out_conv, out_num])
# the rest of the network ...

out = Dense(num_classes, activation='softmax')(...)

# create the model
model = Model([inp_img, inp_num], out)

Of course, to build a model like this you need to use the Keras Functional API, so reading the official guide is strongly recommended.
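To make the schematic above concrete, here is a minimal runnable sketch of the same two-branch idea. The image shape, the metadata length, the number of classes, and every layer size are assumptions chosen for illustration only, and the random arrays merely stand in for real data.

```python
import numpy as np
from tensorflow.keras.layers import (
    Input, Conv2D, MaxPooling2D, Flatten, Dense, concatenate
)
from tensorflow.keras.models import Model

# Assumed sizes, chosen only for illustration.
IMG_SHAPE = (28, 28, 1)   # height, width, channels
NUM_FEATURES = 30         # length of the numerical metadata vector
NUM_CLASSES = 10

# Image branch: convolution and pooling, then flatten to a 2-D tensor.
inp_img = Input(shape=IMG_SHAPE)
x = Conv2D(16, kernel_size=3, padding='same', activation='relu')(inp_img)
x = MaxPooling2D(pool_size=2)(x)
out_conv = Flatten()(x)                      # (batch, 14 * 14 * 16)

# Metadata branch: one small dense layer.
inp_num = Input(shape=(NUM_FEATURES,))
out_num = Dense(16, activation='relu')(inp_num)

# Merge both branches and finish with dense layers.
merged = concatenate([out_conv, out_num])    # (batch, 3136 + 16)
x = Dense(64, activation='relu')(merged)
out = Dense(NUM_CLASSES, activation='softmax')(x)

model = Model([inp_img, inp_num], out)

# Random stand-in data; the list order must match the model's input order.
images = np.random.rand(8, *IMG_SHAPE).astype('float32')
metadata = np.random.rand(8, NUM_FEATURES).astype('float32')
probs = model.predict([images, metadata], verbose=0)
print(probs.shape)
```

The key detail is the `Flatten` call: the convolutional output has to be reduced to shape `(batch, features)` before it can be concatenated with the metadata branch.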

Answer 2

Is there a simple way to feed the numerical metadata, along with the output of the convolutions, into the first fully connected layer?

Yes, it is possible. You need two inputs: one for the numerical metadata and one for the image.

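from tensorflow.keras.layers import Input, Conv2D, Embedding, Concatenate, Dense
from tensorflow.keras.models import Model

inp1 = Input(shape=(28, 28, 1))  # image
inp2 = Input(shape=(30,))  # numerical metadata (assume size of numerical feature is 30)

# kernel_size=3 is an assumed value; the original call omitted it
conv2d = Conv2D(100, kernel_size=3, strides=1, padding='same')(inp1)
# Embedding expects integer indices in [0, 1000); output_dim=8 is an assumed value
embedding = Embedding(1000, 8)(inp2)

# ... rest of the network, producing 2-D tensors feature_image and feature_metadata
prev_layer = Concatenate(axis=-1)([feature_image, feature_metadata])
prediction = Dense(100)(prev_layer)

model = Model(inputs=[inp1, inp2], outputs=prediction)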

See the full example in Keras here.
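To show what concatenating along `axis=-1` does to the tensor shapes, here is a small NumPy-only sketch. It is not Keras code: the 7x7x100 feature map and the batch size of 4 are assumed values, and the reshape plays the role that a `Flatten` layer would play in the model.

```python
import numpy as np

batch = 4
# Assumed convolutional feature map: (batch, height, width, channels).
conv_out = np.zeros((batch, 7, 7, 100))
# Numerical metadata: (batch, 30), matching the 30 features assumed above.
metadata = np.ones((batch, 30))

# Flatten each feature map into one vector per sample.
feature_image = conv_out.reshape(batch, -1)      # (4, 4900)

# Join image features and metadata along the last axis.
merged = np.concatenate([feature_image, metadata], axis=-1)
print(merged.shape)                              # (4, 4930)
```

Each sample ends up as a single vector holding the image features followed by its metadata, which is the input the first fully connected layer receives.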
