Convolutional network model fails to learn on very simple data

Time: 2019-08-22 09:05:57

Tags: python numpy tensorflow keras

I generated a very simple dataset: a set of 3x3 images, each containing either one row or one column of white pixels.


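I am trying to train a convolutional network to distinguish images with a row of white pixels from images with a column of white pixels. The hope is that, by the end of training, the two 2x2 conv filters (the conv layer has no bias) will perform some kind of vertical/horizontal edge detection.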
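To make that goal concrete, here is a minimal NumPy sketch of the kind of filters I have in mind. The filter values and the helper names (`conv2d_valid`, `relu_response`) are hand-picked for illustration; they are not learned weights and not part of Keras.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' cross-correlation without bias, the operation a conv layer applies."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu_response(image, kernel):
    """Sum of the feature map after ReLU: a crude 'how strongly did it fire' score."""
    return np.maximum(conv2d_valid(image, kernel), 0.0).sum()

# Hypothetical hand-picked filters (assumption, not trained values).
row_filter = np.array([[1.0, 1.0], [-1.0, -1.0]])  # fires on a horizontal line
col_filter = np.array([[1.0, -1.0], [1.0, -1.0]])  # fires on a vertical line

# Middle-row and middle-column images, scaled to 0/1 for readability.
row_image = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=np.float64)
col_image = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=np.float64)

row_on_row = relu_response(row_image, row_filter)  # 4.0
col_on_row = relu_response(row_image, col_filter)  # 0.0
row_on_col = relu_response(col_image, row_filter)  # 0.0
col_on_col = relu_response(col_image, col_filter)  # 4.0
print(row_on_row, col_on_row, row_on_col, col_on_col)
```

Note that these particular filters give an all-zero response for a line on the bottom row or in the right column, because the ReLU clips the negative response. The sketch only shows the idea; it does not claim that these two filters solve the task.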

Here is my architecture:

  

Conv2d-> Relu-> Flatten-> Dense-> Softmax

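# Imports assumed from the names used below (not shown in the original post)
import numpy as np
import tensorflow as tf
import keras
from keras import backend as K
from keras.models import Sequential
from keras.layers import Conv2D, Activation, Flatten, Dense

class MyNet: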
    @staticmethod
    def build(width, height, depth, classes):
        # initialize the model along with the input shape to be
        # 'channels last' and the channels dimension itself
        model = Sequential()
        inputShape = (height, width, depth)
        chanDim = -1

        # if we are using "channels first", update the input shape
        # and channels dimension
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)
            chanDim = 1

        # add some layers
        # CONV => RELU
        model.add(Conv2D(filters=2, kernel_size=(2,2), padding="valid",
                          use_bias=False ,input_shape=inputShape))
        model.add(Activation("relu"))
        # finally, flatten and add a fully connected dense layer
        # followed by a softmax classifier:
        model.add(Flatten())
        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))

        # return the constructed network architecture
        return model

*I am using a softmax layer as the output for this experiment.

I set the following parameters:

# initialize the number of epochs to train for, initial learning rate,
# batch size, and image dimensions
np.random.seed(48)
EPOCHS = 100
INIT_LR = 1e-3
BS = 6
IMAGE_DIMS = (3, 3, 1)

model = MyNet.build(3,3,1,2) # create model

# compile model
model.compile(optimizer= keras.optimizers.SGD(lr=INIT_LR),
              loss= tf.nn.softmax_cross_entropy_with_logits_v2,
              metrics=['accuracy'])

# Finally, train model
model.fit(X_train, labels, 
          batch_size=BS,
          epochs=EPOCHS,
          verbose=1
         )

A fairly straightforward classification task, but for some reason that escapes me, the model doesn't learn squat!

...
Epoch 96/100
6/6 [==============================] - 0s 10ms/step - loss: 0.8133 - acc: 0.5000
Epoch 97/100
6/6 [==============================] - 0s 5ms/step - loss: 0.8133 - acc: 0.5000
Epoch 98/100
6/6 [==============================] - 0s 4ms/step - loss: 0.8133 - acc: 0.5000
Epoch 99/100
6/6 [==============================] - 0s 4ms/step - loss: 0.8133 - acc: 0.5000
Epoch 100/100
6/6 [==============================] - 0s 5ms/step - loss: 0.8133 - acc: 0.5000

I tried changing the learning rate, the number of epochs, the optimizer, the loss function, etc., but still nothing.

With the random seed set to 48, I get the following 2 filters, which remain unchanged after training :( so nothing is being learned at all.

model.get_weights()[0][:,:,0,0]

> array([[ 0.23327285, -0.06844258],
>       [ 0.61764306, -0.46921107]], dtype=float32)

--------------------------------------
model.get_weights()[0][:,:,0,1]
> array([[ 0.68603402,  0.13425004],
>       [ 0.39129239,  0.23840684]], dtype=float32)


By the way, I ran this experiment on a conv net with the same architecture that I built from scratch (all NumPy), and it performed better.

The data is set up as follows:



image_a  = np.array([
    [0, 0, 0],
    [255, 255, 255],
    [0, 0, 0],
], dtype="uint8")

image_a_2  = np.array([
    [255, 255, 255],
    [0, 0, 0],
    [0, 0, 0],
], dtype="uint8")

image_a_3  = np.array([
    [0, 0, 0],
    [0, 0, 0],
    [255, 255, 255],
], dtype="uint8")

image_b  = np.array([
    [0, 255, 0],
    [0, 255, 0],
    [0, 255, 0],
], dtype="uint8")

image_b_2  = np.array([
    [255, 0, 0],
    [255, 0, 0],
    [255, 0, 0],
], dtype="uint8")

image_b_3  = np.array([
    [0, 0,  255],
    [0, 0,  255],
    [0, 0,  255],
], dtype="uint8")

X_train = np.array([image_a,image_a_2, image_a_3, image_b, image_b_2, image_b_3])

# set up in correct shape for tensorflow (batch, height, width, channel)
X_train = X_train.reshape(X_train.shape[0], 3, 3, 1)

y = [0,0,0,1,1,1] # images with row of white pixels->0, and col of white->1
labels = np.eye(2, )[y]


So, where am I going wrong?

2 Answers:

Answer 0: (score: 4)

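You are using the tf.nn.softmax_cross_entropy_with_logits_v2 loss function, which expects logits as input; see the documentation (unscaled logits, because it applies a softmax internally for an optimized loss calculation). You therefore should not have a softmax activation on the output of MyNet. If you do not want to remove the softmax activation from MyNet's output because you need it for inference, you can use the tf.keras.losses.categorical_crossentropy loss function instead, with from_logits=False.


Edit:

Another issue is that your Conv2D layer uses only 2 filters, so there are only two sets of weights to learn the 6 cases. This appears to be a problem, because when I changed it to 6 (together with padding='same'), the model reached 100% accuracy.

I also changed the optimizer to Adam, because I have had better experience with that optimizer, and scaled the data to between 0 and 1. See the code below for a working example.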

import tensorflow as tf
import tensorflow.keras.backend as K
import numpy as np
class MyNet:
    @staticmethod
    def build(width, height, depth, classes):
        # initialize the model along with the input shape to be
        # 'channels last' and the channels dimension itself
        model = tf.keras.models.Sequential()
        inputShape = (height, width, depth)

        # if we are using "channels first", update the input shape
        # and channels dimension
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)
        # add some layers
        # CONV => RELU
        model.add(tf.keras.layers.Conv2D(filters=6, kernel_size=(2,2), padding="same",
                          use_bias=False ,input_shape=inputShape))
        model.add(tf.keras.layers.Activation("relu"))
        # finally, flatten and add a fully connected dense layer
        # followed by a softmax classifier:
        model.add(tf.keras.layers.Flatten())
        # softmax classifier
        model.add(tf.keras.layers.Dense(classes))
        model.add(tf.keras.layers.Activation("softmax"))

        # return the constructed network architecture
        return model

image_a  = np.array([
    [0, 0, 0],
    [255, 255, 255],
    [0, 0, 0],
], dtype="uint8")

image_a_2  = np.array([
    [255, 255, 255],
    [0, 0, 0],
    [0, 0, 0],
], dtype="uint8")

image_a_3  = np.array([
    [0, 0, 0],
    [0, 0, 0],
    [255, 255, 255],
], dtype="uint8")

image_b  = np.array([
    [0, 255, 0],
    [0, 255, 0],
    [0, 255, 0],
], dtype="uint8")

image_b_2  = np.array([
    [255, 0, 0],
    [255, 0, 0],
    [255, 0, 0],
], dtype="uint8")

image_b_3  = np.array([
    [0, 0,  255],
    [0, 0,  255],
    [0, 0,  255],
], dtype="uint8")
X_train = np.array([image_a,image_a_2, image_a_3, image_b, image_b_2, image_b_3])

# set up in correct shape for tensorflow (batch, height, width, channel)
X_train = X_train.reshape(X_train.shape[0], 3, 3, 1)/255

y = [0,0,0,1,1,1] # images with row of white pixels->0, and col of white->1
labels = np.eye(2, )[y]



# initialize the number of epochs to train for, initial learning rate,
# batch size, and image dimensions
tf.set_random_seed(48)  # TF 1.x API; on TF 2.x this is tf.random.set_seed(48)
EPOCHS = 100
BS = 6
IMAGE_DIMS = (3, 3, 1)

model = MyNet.build(3,3,1,2) # create model

# compile model
model.compile(optimizer= tf.keras.optimizers.Adam(),
              loss= tf.keras.losses.categorical_crossentropy,
              metrics=['accuracy'])

# Finally, train model
model.fit(X_train, labels, 
          batch_size=BS,
          epochs=EPOCHS,
          verbose=1
         )
pred = model.predict(X_train)
pred_labels = (pred>0.5)
print('All predictions equal to labels, ' + str(np.all(pred_labels ==labels)))
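As a rough illustration of the loss mismatch described in this answer, the NumPy-only sketch below feeds softmax outputs into a cross-entropy that expects logits, so the softmax is effectively applied twice. It is not the TensorFlow implementation; `cross_entropy_from_logits` is a hypothetical helper, and the saturated prediction is an assumption. Under that assumption the mean loss comes out at 0.8133, the value at which the question's training log is stuck, which makes a saturated softmax one plausible reading of that log.

```python
import numpy as np

def cross_entropy_from_logits(logits, onehot):
    """Per-sample cross-entropy that applies log-softmax internally,
    mimicking what a '*_with_logits' loss expects (illustrative helper)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -(onehot * log_probs).sum(axis=-1)

# Same one-hot labels as in the question: three rows (0), three columns (1).
labels = np.eye(2)[[0, 0, 0, 1, 1, 1]]

# Assumption for illustration: the model's softmax has saturated and
# predicts class 0 with probability 1 for every image.
saturated_probs = np.tile([1.0, 0.0], (6, 1))

# Passing probabilities where logits are expected applies softmax twice.
mean_loss = cross_entropy_from_logits(saturated_probs, labels).mean()
print(round(mean_loss, 4))  # 0.8133

# Even perfect one-hot predictions cannot push this loss towards zero.
best_case_loss = cross_entropy_from_logits(labels, labels).mean()
print(round(best_case_loss, 4))  # 0.3133
```

Because the probabilities lie in [0, 1], the second softmax can assign at most about 0.73 to the correct class, so the loss has a floor of roughly 0.313 however well the network classifies.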


Answer 1: (score: 2)

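You should normalize your data to the [0, 1] range. With a [0, 255] range the network may fail to learn. Simply divide the dataset by 255. In addition, I would use the Adam optimizer and binary cross-entropy as the loss.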
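One way the input range can matter, sketched with NumPy: large inputs produce large logits, the softmax saturates, and its slope p * (1 - p) collapses towards zero. The weights `w` and the helper `softmax_slope` are made up for illustration; only the trend is the point. This matters most when the gradient has to pass through the softmax itself, which is not the case for every loss formulation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical weights mapping one pixel value to two class logits.
w = np.array([0.05, -0.05])

def softmax_slope(pixel_value):
    """p * (1 - p) for class 0: how much the softmax output moves per unit logit."""
    p = softmax(w * pixel_value)[0]
    return p * (1.0 - p)

slope_raw = softmax_slope(255.0)             # logits of +/-12.75: saturated
slope_scaled = softmax_slope(255.0 / 255.0)  # logits of +/-0.05: responsive
print(slope_raw, slope_scaled)
```

With the raw pixel value the slope is many orders of magnitude smaller than with the scaled value, so weight updates that depend on it become negligible.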