Keras loss output does not change - activations already checked

Date: 2019-02-08 14:10:56

Tags: python tensorflow machine-learning keras deep-learning

I'm currently struggling with a CNN. I use categorical_crossentropy as the loss of my model, but the accuracy does not increase and the loss does not decrease. The number of labeled samples is currently 600, which is small, but for me the outputs do not change at all.

<Navitem dName={"Resume"} onClickFunction={this.props.onClickFunction.bind(this, downloadref)} />
<Navitem dName={"Certificates"} onClickFunction={this.props.onClickFunction.bind(this, certifref)} />
<Navitem dName={"Skills"} onClickFunction={this.props.onClickFunction.bind(this, skillsref)} />
<Navitem dName={"Projects"} onClickFunction={this.props.onClickFunction.bind(this, projectsref)} />
<Navitem dName={"Life"} onClickFunction={this.props.onClickFunction.bind(this, timelineref)} />
<Navitem dName={"Me"} onClickFunction={this.props.onClickFunction.bind(this, introref)} />

Is something wrong with my model? I have tried changing the learning rate, the image size, simplifying the model, changing the kernel size, running for more epochs (up to 60), and printing the predictions for x_test. The biases also seem to be wrong:

### Define architecture.
from keras.models import Sequential
from keras.layers import (Conv2D, Dropout, BatchNormalization,
                          GlobalMaxPooling2D, Dense)
from keras import optimizers

model = Sequential()
model.add(Conv2D(32, 4, strides=(11, 11), padding="same",
                 input_shape=(200, 200, 3), activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())

model.add(Conv2D(64, 4, strides=(9, 9), padding="same", activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())

model.add(Conv2D(128, 4, strides=(5, 5), padding="same", activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())

model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation="relu"))
model.add(Dense(y_test.shape[1], activation="sigmoid"))

model.summary()

sgd = optimizers.SGD(lr=0.1)  # 0.1
# Pass the instance, not the string 'sgd'; otherwise lr=0.1 is silently ignored.
model.compile(loss='categorical_crossentropy', optimizer=sgd,
              metrics=['accuracy'])

history = model.fit(x_train, y_train, batch_size=32, epochs=10, verbose=1)

Epoch 1/10
420/420 [==============================] - 5s 11ms/step - loss: 1.4598 - acc: 0.2381
Epoch 2/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4679 - acc: 0.2333
Epoch 3/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4335 - acc: 0.2667
Epoch 4/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4198 - acc: 0.2310
Epoch 5/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4038 - acc: 0.2524
Epoch 6/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4343 - acc: 0.2643
Epoch 7/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4281 - acc: 0.2786
Epoch 8/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4097 - acc: 0.2333
Epoch 9/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4071 - acc: 0.2714
Epoch 10/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4135 - acc: 0.2476


Any help is greatly appreciated, thanks!

3 Answers:

Answer 0 (score: 0)

Based on my experience, here are a few things I can recommend you try:

  • Since you are using categorical cross-entropy, try "softmax" as the activation of the last layer instead of "sigmoid".
  • Lower the learning rate (a fresh run with the new setting is recommended).
  • Try a different optimizer, e.g. "adam" instead of "sgd".
  • Remove the dropout and batch-normalization layers and only add them back where necessary.
  • Change the kernel size, perhaps from 4 to (3x3), and shrink the strides as well; you could start from (1,1). A kernel of size 4 moved with stride (11,11) over a [200x200] image is almost equivalent to learning nothing at all.

Please try the last suggestion first, since it looks like the main problem; a minimal sketch of the other changes follows. I hope one of these helps you.
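As a minimal sketch of the softmax / learning-rate / optimizer points, assuming the question's model, Dense, optimizers, and y_test are already in scope (lr=1e-3 is an assumed starting point, not a tuned value):

model.add(Dense(y_test.shape[1], activation="softmax"))  # was sigmoid
model.compile(loss="categorical_crossentropy",
              optimizer=optimizers.Adam(lr=1e-3),  # lower lr than SGD(lr=0.1)
              metrics=["accuracy"])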

Answer 1 (score: 0)

Try the following settings:

  • Reduce the strides to 1x1 or 2x2, at most 3x3.
  • Remove the dropout between convolutional layers; if necessary, use dropout only before the dense layers.
  • Try adding pooling layers after the convolutional layers, preferably with a 2x2 stride and a 2x2 kernel (see the sketch after this list).
  • Change the optimizer to adam/nadam.
  • Use softmax instead of sigmoid.
  • Increase the number of epochs; 10 is far too few.

All of the above points can vary from problem to problem; still, you can try them and adjust the model accordingly. A rough sketch that combines them:
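This is a sketch only, not a tuned model: the filter counts and image size are taken from the question, while the 2x2 pooling and the single 0.2 dropout before the output are assumptions.

from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, GlobalMaxPooling2D,
                          BatchNormalization, Dense, Dropout)

model = Sequential()
# Stride-(1,1) convolutions; MaxPooling2D now does the downsampling.
model.add(Conv2D(32, (3, 3), padding="same",
                 input_shape=(200, 200, 3), activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model.add(Conv2D(64, (3, 3), padding="same", activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model.add(Conv2D(128, (3, 3), padding="same", activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation="relu"))
model.add(Dropout(0.2))  # dropout only before the dense output layer
model.add(Dense(y_test.shape[1], activation="softmax"))

model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])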

Answer 2 (score: 0)

Because of the strides you are using, you are throwing away almost all of the spatial information in the image within the first two layers.

Your model.summary() shows the problem:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 19, 19, 32)        1568      
_________________________________________________________________
dropout_1 (Dropout)          (None, 19, 19, 32)        0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 19, 19, 32)        128       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          32832     
_________________________________________________________________
dropout_2 (Dropout)          (None, 3, 3, 64)          0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 3, 3, 64)          256       
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 1, 1, 128)         131200    
_________________________________________________________________
dropout_3 (Dropout)          (None, 1, 1, 128)         0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 1, 1, 128)         512       
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               16512     
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 516       
=================================================================
Total params: 183,524
Trainable params: 183,076
Non-trainable params: 448

You can see that the tensor size drops immediately from 200 in the original image to 19 after the first convolution, and then to 3 after the second. To really take advantage of convolutional layers, we would expect the dimensions to shrink gradually instead.
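The shrinkage is plain arithmetic: with padding="same", Keras computes each output dimension as ceil(input / stride), so the strides alone account for the summary above. A quick check:

import math

def same_pad_output(size, stride):
    """Spatial output size of a 'same'-padded Conv2D in Keras."""
    return math.ceil(size / stride)

print(same_pad_output(200, 11))  # 19, after conv2d_1
print(same_pad_output(19, 9))    # 3,  after conv2d_2
print(same_pad_output(3, 5))     # 1,  after conv2d_3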

If you keep the code as it is and just change all the strides to (2, 2), you get a much more reasonable structure:

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 100, 100, 32)      1568      
_________________________________________________________________
dropout_1 (Dropout)          (None, 100, 100, 32)      0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 100, 100, 32)      128       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 50, 50, 64)        32832     
_________________________________________________________________
dropout_2 (Dropout)          (None, 50, 50, 64)        0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 50, 50, 64)        256       
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 25, 25, 128)       131200    
_________________________________________________________________
dropout_3 (Dropout)          (None, 25, 25, 128)       0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 25, 25, 128)       512       
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               16512     
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 516       
=================================================================
Total params: 183,524
Trainable params: 183,076
Non-trainable params: 448
_________________________________________________________________