使用VGG和Keras传输学习和微调

Question

作为背景，我对机器学习的世界相对较新，我正在尝试一个目标是在NBA游戏中对游戏进行分类的项目。我输入的是NBA比赛中每场比赛的40帧序列，我的标签是给定比赛的11个无所不包的分类。

计划是采用每个帧序列并将每个帧传递到CNN以提取一组特征。然后，来自给定视频的每个特征序列将被传递到RNN。

我目前正在使用Keras进行大部分实施，我选择将VGG16模型用于CNN。以下是一些相关代码：

video = keras.Input(shape = (None, 255, 255, 3), name = 'video')
cnn = keras.applications.VGG16(include_top=False, weights = None, input_shape=
(255,255,3), pooling = 'avg', classes=11)
cnn.trainable = True

我的问题是 - 将VGG16 ConvNet的权重初始化为＆＃39; imagenet＆＃39;如果我的目标是对NBA比赛的视频片段进行分类？如果是这样，为什么？如果没有，我如何训练VGG16 ConvNet获取我自己的权重集，然后如何将它们插入到此函数中？在使用VGG16模型时，我没有找到任何有人使用自己的权重集的教程。

如果我的问题看起来很幼稚，我很抱歉，但我真的很感激有任何帮助来清理它。

Answer 1

您是否应该为您的特定任务重新训练VGG16？ 绝对不是！重新训练如此庞大的网络很难，并且需要很多直觉和知识来培训深层网络。让我们分析为什么你可以使用在ImageNet上预训练的权重来完成你的任务：

ImageNet是一个庞大的数据集，包含数百万张图像。 VGG16本身已在3-4天左右的强大GPU上接受过培训。在CPU上（假设你没有像NVIDIA GeForce Titan X那样强大的GPU）需要数周时间。
ImageNet包含来自真实世界场景的图像。 NBA比赛也可以被视为现实世界的场景。因此，ImageNet功能的预训练也很可能用于NBA比赛。

实际上，您不需要使用预先训练过的VGG16的所有卷积层。让我们看看visualization of internal VGG16 layers并查看他们检测到的内容（取自this article;图片太大，所以我只提供了一个紧凑的链接）：

第一个和第二个卷积块会查看低级要素，例如角点，边缘等。
第三和第四个卷积块查看曲面特征，曲线，圆等等。
第五层着眼于高级功能

因此，您可以决定哪种功能对您的特定任务有益。在第5街区你需要高水平的功能吗？或者您可能想要使用第3块的中级功能？也许你想在VGG底层叠加另一个神经网络？有关更多说明，请查看我编写的以下教程;它曾经是SO文档。

使用VGG和Keras传输学习和微调

在这个例子中，提出了三个简短而全面的子例子：

从 Keras 库
在VGG的任何层之上堆叠另一个网络进行培训
在其他图层的中间插入图层
使用VGG进行微调和转学习的提示和一般经验法则

加载预先训练的重量

在 Keras 中提供 ImageNet 模型的预训练，包括 VGG-16 和 VGG-19 >。在此示例中，此处和之后将使用 VGG-16 。有关详情，请访问Keras Applications documentation。

from keras import applications

# This will load the whole VGG16 network, including the top Dense layers.
# Note: by specifying the shape of top layers, input tensor shape is forced
# to be (224, 224, 3), therefore you can use it only on 224x224 images.
vgg_model = applications.VGG16(weights='imagenet', include_top=True)

# If you are only interested in convolution filters. Note that by not
# specifying the shape of top layers, the input tensor shape is (None, None, 3),
# so you can use them for any size of images.
vgg_model = applications.VGG16(weights='imagenet', include_top=False)

# If you want to specify input tensor
from keras.layers import Input
input_tensor = Input(shape=(160, 160, 3))
vgg_model = applications.VGG16(weights='imagenet',
                               include_top=False,
                               input_tensor=input_tensor)

# To see the models' architecture and layer names, run the following
vgg_model.summary()

创建一个新网络，底层取自VGG

假设对于尺寸为(160, 160, 3)的图像的某些特定任务，您希望使用预先训练的VGG底层，最多使用名称为block2_pool的图层。

vgg_model = applications.VGG16(weights='imagenet',
                               include_top=False,
                               input_shape=(160, 160, 3))

# Creating dictionary that maps layer names to the layers
layer_dict = dict([(layer.name, layer) for layer in vgg_model.layers])

# Getting output tensor of the last VGG layer that we want to include
x = layer_dict['block2_pool'].output

# Stacking a new simple convolutional network on top of it    
x = Conv2D(filters=64, kernel_size=(3, 3), activation='relu')(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(10, activation='softmax')(x)

# Creating new model. Please note that this is NOT a Sequential() model.
from keras.models import Model
custom_model = Model(input=vgg_model.input, output=x)

# Make sure that the pre-trained bottom layers are not trainable
for layer in custom_model.layers[:7]:
    layer.trainable = False

# Do not forget to compile it
custom_model.compile(loss='categorical_crossentropy',
                     optimizer='rmsprop',
                     metrics=['accuracy'])

删除多个图层并在中间插入一个新图层

假设您需要通过将block1_conv1和block2_conv2替换为单个卷积层来加速VGG16，以便保存预先训练的权重。我们的想法是将整个网络拆分为单独的层，然后再组装。以下是专门针对您的任务的代码：

vgg_model = applications.VGG16(include_top=True, weights='imagenet')

# Disassemble layers
layers = [l for l in vgg_model.layers]

# Defining new convolutional layer.
# Important: the number of filters should be the same!
# Note: the receiptive field of two 3x3 convolutions is 5x5.
new_conv = Conv2D(filters=64, 
                  kernel_size=(5, 5),
                  name='new_conv',
                  padding='same')(layers[0].output)

# Now stack everything back
# Note: If you are going to fine tune the model, do not forget to
#       mark other layers as un-trainable
x = new_conv
for i in range(3, len(layers)):
    layers[i].trainable = False
    x = layers[i](x)

# Final touch
result_model = Model(input=layer[0].input, output=x)

我需要Keras VGG16的预训练重量吗？

1 个答案:

使用VGG和Keras传输学习和微调

加载预先训练的重量

创建一个新网络，底层取自VGG

删除多个图层并在中间插入一个新图层