I found that the intermediate layers of my CNN show absolutely no feature variation on my dataset, as shown below:
The example in the Google ML Crash Course (cats vs. dogs) shows the various features its CNN extracts, as shown below:
I believe this is exactly why the CNN's accuracy on my dataset is so low, only around 6%...
Does anyone know why this might be?
I don't think the problem is my dataset (14 classes, 100-450 images per class, roughly 2500 images in the training set in total), so I'll skip the dataset details in this post. I think it is more of a CNN-related issue...?
My CNN architecture is designed as follows:
from tensorflow.keras import layers
from tensorflow.keras import Model

# Our input feature map is 150x150x3: 150x150 for the image pixels, and 3 for
# the three color channels: R, G, and B
img_input = layers.Input(shape=(150, 150, 3))
print(img_input)
# First convolution extracts 16 filters that are 3x3 (stride 1, 'same' padding).
# The 2x2 max-pooling layers from the original tutorial are commented out below.
x = layers.Conv2D(16, 3, activation='relu', strides=(1,1), padding='same')(img_input)
# x = layers.MaxPooling2D(2)(x)
print(x)
# Second convolution extracts 32 filters that are 3x3;
# downsampling is done with a stride of 2 instead of max-pooling.
x = layers.Conv2D(32, 3, activation='relu', strides=(2,2), padding='same')(x)
# x = layers.MaxPooling2D(2)(x)
print(x)
# Third convolution extracts 64 filters that are 3x3, again downsampling with stride 2
x = layers.Conv2D(64, 3, activation='relu', strides=(2,2), padding='same')(x)
# x = layers.MaxPooling2D(2)(x)
print(x)
# Fourth convolution extracts 128 filters that are 3x3, again downsampling with stride 2
x = layers.Conv2D(128, 3, activation='relu', strides=(2,2), padding='same')(x)
# x = layers.MaxPooling2D(2)(x)
print(x)
# Fifth convolution extracts 256 filters that are 3x3, again downsampling with stride 2
x = layers.Conv2D(256, 3, activation='relu', strides=(2,2), padding='same')(x)
# x = layers.MaxPooling2D(2)(x)
print(x)
The printed tensors look like this:
Tensor("input_1:0", shape=(?, 150, 150, 3), dtype=float32)
Tensor("conv2d/Relu:0", shape=(?, 150, 150, 16), dtype=float32)
Tensor("conv2d_1/Relu:0", shape=(?, 75, 75, 32), dtype=float32)
Tensor("conv2d_2/Relu:0", shape=(?, 38, 38, 64), dtype=float32)
Tensor("conv2d_3/Relu:0", shape=(?, 19, 19, 128), dtype=float32)
Tensor("conv2d_4/Relu:0", shape=(?, 10, 10, 256), dtype=float32)
Then:
# Flatten feature map to a 1-dim tensor so we can add fully connected layers
x = layers.Flatten()(x)
# Create a fully connected layer with ReLU activation and 512 hidden units
x = layers.Dense(512, activation='relu')(x)
# Create output layer with a single node and sigmoid activation
output = layers.Dense(1, activation='sigmoid')(x)
# Create model:
# input = input feature map
# output = input feature map + stacked strided-convolution layers + fully
# connected layer + sigmoid output layer
model = Model(img_input, output)
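The compile and fit calls are not shown above; they follow the crash-course setup, roughly like this (a sketch rather than my exact code; train_generator and validation_generator are placeholder names for my data generators):

from tensorflow.keras.optimizers import RMSprop

# Assumed training setup, mirroring the crash-course cat-vs-dog exercise:
# binary cross-entropy paired with the single sigmoid output above.
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(lr=0.001),
              metrics=['acc'])

# train_generator / validation_generator are placeholders for my generators;
# 126 steps per epoch and verbose=2 match the log below.
history = model.fit_generator(train_generator,
                              steps_per_epoch=126,
                              epochs=15,
                              validation_data=validation_generator,
                              verbose=2)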
These are the per-epoch results:
Epoch 1/15
126/126 - 92s - loss: -4.6284e+13 - acc: 0.0647 - val_loss: -3.0235e+14 - val_acc: 0.0615
Epoch 2/15
126/126 - 50s - loss: -3.6250e+15 - acc: 0.0639 - val_loss: -1.2503e+16 - val_acc: 0.0615
Epoch 3/15
126/126 - 50s - loss: -4.7133e+16 - acc: 0.0639 - val_loss: -1.2015e+17 - val_acc: 0.0615
Epoch 4/15
126/126 - 50s - loss: -3.0991e+17 - acc: 0.0639 - val_loss: -6.4998e+17 - val_acc: 0.0615
Epoch 5/15
126/126 - 51s - loss: -1.3102e+18 - acc: 0.0639 - val_loss: -2.4530e+18 - val_acc: 0.0615
Epoch 6/15
126/126 - 50s - loss: -4.2291e+18 - acc: 0.0639 - val_loss: -7.3530e+18 - val_acc: 0.0615
Epoch 7/15
126/126 - 50s - loss: -1.1655e+19 - acc: 0.0639 - val_loss: -1.8978e+19 - val_acc: 0.0615
Epoch 8/15
126/126 - 50s - loss: -2.8185e+19 - acc: 0.0639 - val_loss: -4.3459e+19 - val_acc: 0.0615
Epoch 9/15
126/126 - 50s - loss: -6.0396e+19 - acc: 0.0639 - val_loss: -9.0798e+19 - val_acc: 0.0615
Epoch 10/15
126/126 - 51s - loss: -1.2250e+20 - acc: 0.0639 - val_loss: -1.7633e+20 - val_acc: 0.0615
Epoch 11/15
126/126 - 49s - loss: -2.2829e+20 - acc: 0.0639 - val_loss: -3.2223e+20 - val_acc: 0.0615
Epoch 12/15
126/126 - 50s - loss: -4.0790e+20 - acc: 0.0639 - val_loss: -5.5966e+20 - val_acc: 0.0615
Epoch 13/15
126/126 - 51s - loss: -6.9094e+20 - acc: 0.0639 - val_loss: -9.3551e+20 - val_acc: 0.0615
Epoch 14/15
126/126 - 50s - loss: -1.1305e+21 - acc: 0.0639 - val_loss: -1.5039e+21 - val_acc: 0.0615
Epoch 15/15
126/126 - 50s - loss: -1.7871e+21 - acc: 0.0639 - val_loss: -2.3466e+21 - val_acc: 0.0615
I am also attaching two graphs here to illustrate the loss and the accuracy:
This is basically the model from the Google ML Crash Course; I only changed a few parameters here and there and then fed in my own data to see what would happen.
My theory is that, for some unknown reason, the CNN is not extracting any features from my dataset, so the network just settles into a local minimum within 2 epochs; a rough way to check this is sketched below.
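To test this, the intermediate activations can be dumped with a small auxiliary model, e.g. like this (a rough sketch that assumes the model defined above and a hypothetical preprocessed image img_array of shape (1, 150, 150, 3)):

# Collect the output tensor of every Conv2D layer of the model defined above.
conv_outputs = [layer.output for layer in model.layers if 'conv2d' in layer.name]

# Auxiliary model that maps an input image to all intermediate feature maps.
visualization_model = Model(inputs=model.input, outputs=conv_outputs)

# img_array is a hypothetical preprocessed image, shape (1, 150, 150, 3), values in [0, 1].
feature_maps = visualization_model.predict(img_array)

# If the network learns nothing useful, the feature maps barely vary: the mean
# per-channel standard deviation stays close to zero at every layer.
for tensor, fmap in zip(conv_outputs, feature_maps):
    print(tensor.name, fmap.shape, fmap.std(axis=(0, 1, 2)).mean())

If those numbers are essentially zero at every layer, the convolutions are producing flat feature maps, which would match the pictures above.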
Any help would be greatly appreciated while I try to work this out.
Answer (score 0):
If you have 14 classes, this doesn't make sense:
# Create output layer with a single node and sigmoid activation
output = layers.Dense(1, activation='sigmoid')(x)
You should use a softmax activation with 14 neurons:
output = layers.Dense(14, activation='softmax')(x)
Then make sure you use the right loss (categorical or sparse categorical cross-entropy) and one-hot encode your labels if that loss requires it.
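For example, the end of the model and the compile step could look roughly like this (a sketch; the 'rmsprop' optimizer and the label formats are assumptions about your setup):

# 14-way softmax output instead of a single sigmoid unit
output = layers.Dense(14, activation='softmax')(x)
model = Model(img_input, output)

# With integer labels 0..13 (e.g. class_mode='sparse' in flow_from_directory):
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])

# With one-hot labels (e.g. class_mode='categorical'), use instead:
# model.compile(loss='categorical_crossentropy',
#               optimizer='rmsprop',
#               metrics=['acc'])

This also explains the huge negative losses in your log: with a single sigmoid output, integer labels larger than 1 make binary cross-entropy unbounded below, so the optimizer drives the loss towards minus infinity instead of towards zero, which is exactly what the epoch log shows.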