What makes my detection model predict NaN?

Date: 2020-08-05 03:35:25

Tags: python tensorflow machine-learning keras deep-learning

I am building a model to detect objects in images. The problem comes from the Kaggle competition Global Wheat Detection. I am trying to build a YOLO-inspired model, but there is only one class in this case, and I want to simplify things by predicting one box per grid cell. So I built a model that predicts a (16, 16, 5) tensor: 16 x 16 are the grid cells, and the 5 channels are (c, x, y, w, h), normalized to [0, 1]. The loss function follows the YOLO paper, except that OBJ_SCALE and NO_OBJ_SCALE are set to 0.005 and 0.0005 respectively, because my guess was that the NaNs come from the loss being too large. In fact, when I set them to 5 and 0.5 as the paper suggests, NaN already appears in the first batch. But lowering them did not help much.

How can I deal with this problem? What am I missing or doing wrong?
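
For reference, the targets are built roughly like this (a simplified sketch, not my actual preprocessing code; make_target and the image-relative box format are illustrative assumptions):

import numpy as np

GRID = 16

def make_target(boxes):
    # boxes: list of (x, y, w, h), each normalized to [0, 1] relative to the image
    # returns a (16, 16, 5) array holding (c, x, y, w, h) per grid cell
    target = np.zeros((GRID, GRID, 5), dtype=np.float32)
    for x, y, w, h in boxes:
        col = min(int(x * GRID), GRID - 1)  # cell containing the box center
        row = min(int(y * GRID), GRID - 1)
        target[row, col] = [1.0, x, y, w, h]  # c = 1 where a box exists
    return target

The loss function is: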

import tensorflow as tf

def loss_function(y_true, y_pred):

    tf.print('\n y_pred', y_pred[0,0,0])
    tf.print('y_true', y_true[0,0,0])

    OBJ_SCALE = 0.005
    NO_OBJ_SCALE = 0.0005

    # channel layout: (c, x, y, w, h), all normalized to [0, 1]
    true_conf = y_true[...,0:1]
    true_xy   = y_true[...,1:3]
    true_wh   = y_true[...,3:]

    pred_conf = y_pred[...,0:1]
    pred_xy   = y_pred[...,1:3]
    pred_wh   = y_pred[...,3:]

    # confidence channel is 1 where a box exists, 0 elsewhere, scaled by OBJ_SCALE
    obj_mask = tf.expand_dims(y_true[..., 0], axis = -1) * OBJ_SCALE
    # note that obj_mask is already scaled here, so (1 - obj_mask) stays close
    # to 1 even in cells that do contain an object
    noobj_mask = (1 - obj_mask) * NO_OBJ_SCALE

    loss_xy    = tf.reduce_sum(tf.square((true_xy - pred_xy) * obj_mask))
    # square roots of w and h, as in the YOLO paper
    loss_wh    = tf.reduce_sum(tf.square((tf.sqrt(true_wh) - tf.sqrt(pred_wh)) * obj_mask))
    loss_obj   = tf.reduce_sum(tf.square((true_conf - pred_conf) * obj_mask))
    loss_noobj = tf.reduce_sum(tf.square((true_conf - pred_conf) * noobj_mask))
    loss = loss_xy + loss_wh + loss_obj + loss_noobj

    tf.print('loss_xy', loss_xy)
    tf.print('loss_xy', loss_xy)  # prints loss_xy twice; loss_wh is never shown
    tf.print('loss_obj', loss_obj)
    tf.print('loss_noobj', loss_noobj)
    tf.print('loss', loss)

    return loss
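
One detail that may matter here: the gradient of tf.sqrt at 0 is infinite, and pred_wh can be exactly 0 (the relu head outputs hard zeros, as visible in the log below). When an infinite gradient meets a zero mask, 0 * inf gives NaN. A standalone snippet (not from the training code) that shows the effect:

import tensorflow as tf

x = tf.Variable([0.0, 1.0])
with tf.GradientTape() as tape:
    y = tf.reduce_sum(tf.sqrt(x) * 0.0)  # zero mask, like a cell without an object
print(tape.gradient(y, x))  # [nan 0]: the sqrt gradient at 0 is inf, and 0 * inf = nan

A common guard (illustrative, with an arbitrary epsilon) is to clamp the prediction before taking the square root:

loss_wh = tf.reduce_sum(tf.square(
    (tf.sqrt(true_wh) - tf.sqrt(tf.maximum(pred_wh, 1e-8))) * obj_mask))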

When I train this model, it prints:

y_pred [0.2299328 0.604008436 0.498961449 1.22923946 0]
y_true [0 0 0 0 0]
loss_xy 0.00649014302
loss_xy 0.00649014302
loss_obj 0.00952909887
loss_noobj 0.000192464562
loss 0.0197582766
  1/334 [..............................] - ETA: 0s - loss: 0.0198 - accuracy: 0.2285
y_pred [-nan -nan -nan -nan -nan]
y_true [0 0 0 0 0]
loss_xy -nan
loss_xy -nan
loss_obj -nan
loss_noobj -nan
loss -nan
  2/334 [..............................] - ETA: 13:52 - loss: nan - accuracy: 0.6143
y_pred [-nan -nan -nan -nan -nan]
y_true [0 0 0 0 0]
loss_xy -nan
loss_xy -nan
loss_obj -nan
loss_noobj -nan
loss -nan
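
To at least stop training at the first bad batch and locate where the NaN is first produced, TensorFlow has two built-in helpers (a sketch; the optimizer and the training data are placeholders, not taken from my actual script):

import tensorflow as tf

tf.debugging.enable_check_numerics()  # raise on the first op that produces inf/nan
model.compile(optimizer = 'adam', loss = loss_function)  # 'adam' is a placeholder choice
model.fit(x_train, y_train,
          callbacks = [tf.keras.callbacks.TerminateOnNaN()])  # abort fit once loss is nan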

The model is:

inputs = tf.keras.Input(shape=(256,256,3)) #inputs.shape is (None, 256, 256, 3)

# block 1: 256x256 -> 128x128
x = tf.keras.layers.Conv2D(32, (16, 16), strides = (1, 1), padding='same')(inputs)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.MaxPooling2D((2, 2), strides = (2, 2))(x)

# block 2: 128x128 -> 64x64
x = tf.keras.layers.Conv2D(64, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.MaxPooling2D((2, 2), strides = (2, 2))(x)

# block 3: 64x64 -> 32x32
x = tf.keras.layers.Conv2D(128, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(64, (1, 1), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(128, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.MaxPooling2D((2, 2), strides = (2, 2))(x)

# block 4: 32x32 -> 16x16
x = tf.keras.layers.Conv2D(256, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(128, (1, 1), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(256, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.MaxPooling2D((2, 2), strides = (2, 2))(x)

# block 5: 16x16 -> 8x8
x = tf.keras.layers.Conv2D(512, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(256, (1, 1), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(512, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(256, (1, 1), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(512, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.MaxPooling2D((2, 2), strides = (2, 2))(x)

# block 6: stays at 8x8 (no pooling)
x = tf.keras.layers.Conv2D(1024, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(512, (1, 1), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(1024, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(512, (1, 1), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(256, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
x = tf.keras.layers.Conv2D(128, (3, 3), strides = (1, 1), padding = 'same')(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)

x = tf.keras.layers.AveragePooling2D((2,2), strides = (2,2))(x)  # 8x8 -> 4x4

x = tf.keras.layers.Flatten()(x)  # 4 * 4 * 128 = 2048 features
x = tf.keras.layers.Dense(1280, activation = 'relu')(x)  # 1280 = 16 * 16 * 5; relu output is unbounded above
#x = tf.keras.layers.LeakyReLU(alpha = 0.1)(x)
#outputs = tf.keras.layers.Dense(10, activation = 'sigmoid')(x)
outputs = tf.keras.layers.Reshape((16,16,5))(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
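
Since all five target channels live in [0, 1], one variant of the head would be to bound the output with a sigmoid instead of relu (a sketch of the last layers only; everything up to the Flatten stays the same, and this alone is not guaranteed to remove the NaN):

x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(16 * 16 * 5, activation = 'sigmoid')(x)  # 1280 outputs in (0, 1)
outputs = tf.keras.layers.Reshape((16,16,5))(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)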

0 Answers