I found that the intermediate layers of my CNN show absolutely no feature variation on my dataset, as shown below:
The example in the Google ML Crash Course (cats vs. dogs) shows the various features its CNN extracts, as shown below:
I believe this is exactly why the CNN's accuracy on my dataset is so low, only around 6%...
Does anyone know why this might be?
I don't think the problem is my dataset (14 classes, 100-450 images per class, roughly 2500 images in the training set in total), so I'll skip the dataset details in this post. I think it is more of a CNN-related issue...?
My CNN architecture is designed as follows:
from tensorflow.keras import layers
from tensorflow.keras import Model

# Our input feature map is 150x150x3: 150x150 for the image pixels, and 3 for
# the three color channels: R, G, and B
img_input = layers.Input(shape=(150, 150, 3))
print(img_input)
# First convolution extracts 16 filters that are 3x3 (stride 1, 'same' padding).
# The 2x2 max-pooling layers from the original tutorial are commented out below.
x = layers.Conv2D(16, 3, activation='relu', strides=(1,1), padding='same')(img_input)
# x = layers.MaxPooling2D(2)(x)
print(x)
# Second convolution extracts 32 filters that are 3x3;
# downsampling is done with a stride of 2 instead of max-pooling.
x = layers.Conv2D(32, 3, activation='relu', strides=(2,2), padding='same')(x)
# x = layers.MaxPooling2D(2)(x)
print(x)
# Third convolution extracts 64 filters that are 3x3, again downsampling with stride 2
x = layers.Conv2D(64, 3, activation='relu', strides=(2,2), padding='same')(x)
# x = layers.MaxPooling2D(2)(x)
print(x)
# Fourth convolution extracts 128 filters that are 3x3, again downsampling with stride 2
x = layers.Conv2D(128, 3, activation='relu', strides=(2,2), padding='same')(x)
# x = layers.MaxPooling2D(2)(x)
print(x)
# Fifth convolution extracts 256 filters that are 3x3, again downsampling with stride 2
x = layers.Conv2D(256, 3, activation='relu', strides=(2,2), padding='same')(x)
# x = layers.MaxPooling2D(2)(x)
print(x)
The printed tensors look like this:
Tensor("input_1:0", shape=(?, 150, 150, 3), dtype=float32)
Tensor("conv2d/Relu:0", shape=(?, 150, 150, 16), dtype=float32)
Tensor("conv2d_1/Relu:0", shape=(?, 75, 75, 32), dtype=float32)
Tensor("conv2d_2/Relu:0", shape=(?, 38, 38, 64), dtype=float32)
Tensor("conv2d_3/Relu:0", shape=(?, 19, 19, 128), dtype=float32)
Tensor("conv2d_4/Relu:0", shape=(?, 10, 10, 256), dtype=float32)
Then:
# Flatten feature map to a 1-dim tensor so we can add fully connected layers
x = layers.Flatten()(x)
# Create a fully connected layer with ReLU activation and 512 hidden units
x = layers.Dense(512, activation='relu')(x)
# Create output layer with a single node and sigmoid activation
output = layers.Dense(1, activation='sigmoid')(x)
# Create model:
# input = input feature map
# output = input feature map + stacked strided-convolution layers + fully
# connected layer + sigmoid output layer
model = Model(img_input, output)
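The compile and fit calls are not shown above; they follow the crash-course setup, roughly like this (a sketch rather than my exact code; train_generator and validation_generator are placeholder names for my data generators):

from tensorflow.keras.optimizers import RMSprop

# Assumed training setup, mirroring the crash-course cat-vs-dog exercise:
# binary cross-entropy paired with the single sigmoid output above.
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(lr=0.001),
              metrics=['acc'])

# train_generator / validation_generator are placeholders for my generators;
# 126 steps per epoch and verbose=2 match the log below.
history = model.fit_generator(train_generator,
                              steps_per_epoch=126,
                              epochs=15,
                              validation_data=validation_generator,
                              verbose=2)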
These are the per-epoch results:
Epoch 1/15
126/126 - 92s - loss: -4.6284e+13 - acc: 0.0647 - val_loss: -3.0235e+14 - val_acc: 0.0615
Epoch 2/15
126/126 - 50s - loss: -3.6250e+15 - acc: 0.0639 - val_loss: -1.2503e+16 - val_acc: 0.0615
Epoch 3/15
126/126 - 50s - loss: -4.7133e+16 - acc: 0.0639 - val_loss: -1.2015e+17 - val_acc: 0.0615
Epoch 4/15
126/126 - 50s - loss: -3.0991e+17 - acc: 0.0639 - val_loss: -6.4998e+17 - val_acc: 0.0615
Epoch 5/15
126/126 - 51s - loss: -1.3102e+18 - acc: 0.0639 - val_loss: -2.4530e+18 - val_acc: 0.0615
Epoch 6/15
126/126 - 50s - loss: -4.2291e+18 - acc: 0.0639 - val_loss: -7.3530e+18 - val_acc: 0.0615
Epoch 7/15
126/126 - 50s - loss: -1.1655e+19 - acc: 0.0639 - val_loss: -1.8978e+19 - val_acc: 0.0615
Epoch 8/15
126/126 - 50s - loss: -2.8185e+19 - acc: 0.0639 - val_loss: -4.3459e+19 - val_acc: 0.0615
Epoch 9/15
126/126 - 50s - loss: -6.0396e+19 - acc: 0.0639 - val_loss: -9.0798e+19 - val_acc: 0.0615
Epoch 10/15
126/126 - 51s - loss: -1.2250e+20 - acc: 0.0639 - val_loss: -1.7633e+20 - val_acc: 0.0615
Epoch 11/15
126/126 - 49s - loss: -2.2829e+20 - acc: 0.0639 - val_loss: -3.2223e+20 - val_acc: 0.0615
Epoch 12/15
126/126 - 50s - loss: -4.0790e+20 - acc: 0.0639 - val_loss: -5.5966e+20 - val_acc: 0.0615
Epoch 13/15
126/126 - 51s - loss: -6.9094e+20 - acc: 0.0639 - val_loss: -9.3551e+20 - val_acc: 0.0615
Epoch 14/15
126/126 - 50s - loss: -1.1305e+21 - acc: 0.0639 - val_loss: -1.5039e+21 - val_acc: 0.0615
Epoch 15/15
126/126 - 50s - loss: -1.7871e+21 - acc: 0.0639 - val_loss: -2.3466e+21 - val_acc: 0.0615
I am also attaching two graphs here to illustrate the loss and the accuracy:
This is basically the model from the Google ML Crash Course; I only changed a few parameters here and there and then fed in my own data to see what would happen.
My theory is that, for some unknown reason, the CNN is not extracting any features from my dataset, so the network just settles into a local minimum within 2 epochs; a rough way to check this is sketched below.
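To test this, the intermediate activations can be dumped with a small auxiliary model, e.g. like this (a rough sketch that assumes the model defined above and a hypothetical preprocessed image img_array of shape (1, 150, 150, 3)):

# Collect the output tensor of every Conv2D layer of the model defined above.
conv_outputs = [layer.output for layer in model.layers if 'conv2d' in layer.name]

# Auxiliary model that maps an input image to all intermediate feature maps.
visualization_model = Model(inputs=model.input, outputs=conv_outputs)

# img_array is a hypothetical preprocessed image, shape (1, 150, 150, 3), values in [0, 1].
feature_maps = visualization_model.predict(img_array)

# If the network learns nothing useful, the feature maps barely vary: the mean
# per-channel standard deviation stays close to zero at every layer.
for tensor, fmap in zip(conv_outputs, feature_maps):
    print(tensor.name, fmap.shape, fmap.std(axis=(0, 1, 2)).mean())

If those numbers are essentially zero at every layer, the convolutions are producing flat feature maps, which would match the pictures above.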
Any help would be greatly appreciated while I try to work this out.
Answer (score 0):
If you have 14 classes, this doesn't make sense:
# Create output layer with a single node and sigmoid activation
output = layers.Dense(1, activation='sigmoid')(x)
You should use a softmax activation with 14 neurons:
output = layers.Dense(14, activation='softmax')(x)
Then make sure you use the right loss (categorical or sparse categorical cross-entropy) and one-hot encode your labels if that loss requires it.
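For example, the end of the model and the compile step could look roughly like this (a sketch; the 'rmsprop' optimizer and the label formats are assumptions about your setup):

# 14-way softmax output instead of a single sigmoid unit
output = layers.Dense(14, activation='softmax')(x)
model = Model(img_input, output)

# With integer labels 0..13 (e.g. class_mode='sparse' in flow_from_directory):
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])

# With one-hot labels (e.g. class_mode='categorical'), use instead:
# model.compile(loss='categorical_crossentropy',
#               optimizer='rmsprop',
#               metrics=['acc'])

This also explains the huge negative losses in your log: with a single sigmoid output, integer labels larger than 1 make binary cross-entropy unbounded below, so the optimizer drives the loss towards minus infinity instead of towards zero, which is exactly what the epoch log shows.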