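I am trying to train a CNN that outputs the (x, y) coordinates of facial features, such as the start, center, and end of the left eye and the start, center, and end of the right eye. According to the Deep Learning Specialization on Coursera, I need a 0 or 1 value (present or not) for each point, followed by the x and y coordinates of that point. If the value is 0 (not present), the x and y outputs should be ignored during training.
I looked for a way to do this but could not find one. In the end, I set the x, y coordinates to 0 for every missing point (points for which we have no annotation for that feature) and trained the following network.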
# imports assume the Keras API bundled with TensorFlow 2.x
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                                     Flatten, Input, MaxPooling2D, PReLU)
from tensorflow.keras.models import Model

def get_model():
    inputs = Input(shape=(96, 96, 1))
    # a layer instance is callable on a tensor, and returns a tensor
    x = Conv2D(16, kernel_size=5, padding='same', activation='relu')(inputs)
    x = Conv2D(32, kernel_size=5, padding='valid', activation='relu')(x)
    x = Dropout(0.25)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = BatchNormalization()(x)
    x = Conv2D(64, kernel_size=5, padding='valid', activation='relu')(x)
    x = Conv2D(128, kernel_size=5, padding='valid', activation='relu')(x)
    x = Dropout(0.25)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = BatchNormalization()(x)
    x = Conv2D(256, kernel_size=3, padding='valid', activation='relu')(x)
    x = Dropout(0.25)(x)
    # note: pool_size=(1, 1) leaves the feature map unchanged
    x = MaxPooling2D(pool_size=(1, 1))(x)
    x = Flatten()(x)
    # classification head: one presence flag per keypoint (15 points)
    present = Dense(256)(x)
    present = PReLU()(present)
    present = Dense(15, activation="sigmoid", name="classification")(present)
    # regression head: x and y for each keypoint (30 values)
    position = Dense(256)(x)
    position = PReLU()(position)
    position = BatchNormalization()(position)
    position = Dense(128)(position)
    position = PReLU()(position)
    position = BatchNormalization()(position)
    position = Dense(64)(position)
    position = PReLU()(position)
    position = Dense(30, activation="relu", name="position")(position)
    # This creates a model with one input and two outputs
    # (presence flags and coordinates)
    model = Model(inputs=inputs, outputs=[present, position])
    model.compile(optimizer='adam',
                  loss={'classification': 'binary_crossentropy', 'position': 'mse'},
                  metrics=['accuracy'])
    return model
The classification accuracy is good, but the position accuracy is only 1%.
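To make the target layout described above concrete, here is a minimal sketch of how the two label arrays could be built. It assumes that missing keypoints are stored as NaN and that coordinates are flattened as x0, y0, x1, y1, and so on. The helper name `build_targets` and that storage convention are illustrative assumptions, not taken from the original code.

```python
import numpy as np

def build_targets(keypoints):
    """Split raw annotations into presence flags and zero-filled coordinates.

    keypoints: array of shape (n_samples, n_points, 2); a point counts as
    missing if either of its coordinates is NaN (assumed convention).
    """
    keypoints = np.asarray(keypoints, dtype=np.float32)
    # 1.0 where both x and y are annotated, 0.0 otherwise
    present = (~np.isnan(keypoints).any(axis=-1)).astype(np.float32)
    # zero-fill both coordinates of every missing point
    filled = np.where(present[..., None] > 0, keypoints, 0.0)
    # flatten to (n_samples, 2 * n_points) in x0, y0, x1, y1, ... order
    position = filled.reshape(len(keypoints), -1)
    return present, position

# one sample with two keypoints; the second one is only partly annotated
raw = np.array([[[10.0, 20.0], [np.nan, 5.0]]])
present, position = build_targets(raw)
print(present)   # [[1. 0.]]
print(position)  # [[10. 20.  0.  0.]]
```

With 15 keypoints this yields a (n, 15) array for the `classification` output and a (n, 30) array for the `position` output.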
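Answer 0 (score: 0)
You can create a custom loss that ignores points whose confidence value is below a threshold when computing the coordinate loss, while still including them in the confidence loss: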
# assumes a TensorFlow version whose tf.keras still exposes the classic
# backend API (K.sum, K.square, K.mean, K.eval)
import tensorflow as tf
from tensorflow.keras import backend as K

def custom_loss(y_true, y_pred, conf_dim=-1, conf_thresh=0, px_weight=0.5, conf_weight=0.5):
    # compute sum of squared coordinate errors (all columns before the confidence)
    px_err = K.sum(K.square(y_pred[:, :conf_dim] - y_true[:, :conf_dim]), axis=-1)
    # compute squared confidence errors
    conf_err = K.square(y_pred[:, conf_dim] - y_true[:, conf_dim])
    # set loss of points whose true confidences are at or below our threshold to zero
    px_err = px_err * tf.cast(y_true[:, conf_dim] > conf_thresh, y_true.dtype)
    # calculate mean over errors
    px_err, conf_err = K.mean(px_err), K.mean(conf_err)
    # return sum of weighted pixel and confidence errors
    return px_weight * px_err + conf_weight * conf_err
# compute a small example with numpy arrays
import numpy as np

# -------
# x, y, c
# -------
y_true = np.array([[1, 2, 1],
                   [3, 4, 0],   # confidence zero
                   [5, 6, 1]])
y_pred = np.array([[1, 2, 1],   # correct
                   [1, 2, 1],   # wrong (not included in pixel loss because of confidence)
                   [2, 3, 1]])  # wrong
# run through loss function
print("Loss:", K.eval(custom_loss(K.variable(y_true), K.variable(y_pred))))
# prints: Loss: 3.1666667
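The loss above handles one (x, y, confidence) triple per row. For the layout in the question (15 presence flags and 30 coordinates per image), the same masking idea can be applied per keypoint. The following is only a NumPy sketch of the arithmetic, not a drop-in Keras loss. The function name `masked_mse` is hypothetical, and it assumes the coordinates are ordered x0, y0, x1, y1, and so on, so that each flag covers two consecutive values.

```python
import numpy as np

def masked_mse(present, pos_true, pos_pred):
    """Mean squared coordinate error over labeled keypoints only.

    present:  (n_samples, n_points) array of 0/1 flags from the labels
    pos_true: (n_samples, 2 * n_points) target coordinates
    pos_pred: (n_samples, 2 * n_points) predicted coordinates
    """
    present = np.asarray(present, dtype=np.float64)
    pos_true = np.asarray(pos_true, dtype=np.float64)
    pos_pred = np.asarray(pos_pred, dtype=np.float64)
    # repeat each flag so that x and y of a point share the same mask value
    mask = np.repeat(present, 2, axis=-1)
    sq_err = mask * (pos_pred - pos_true) ** 2
    # average over labeled coordinates only; guard against an all-missing batch
    return sq_err.sum() / max(mask.sum(), 1.0)

# one image, two keypoints; the second keypoint is missing (zero-filled target)
present = np.array([[1, 0]])
pos_true = np.array([[10.0, 20.0, 0.0, 0.0]])
pos_pred = np.array([[12.0, 20.0, 7.0, 9.0]])

print(masked_mse(present, pos_true, pos_pred))  # 2.0
print(np.mean((pos_pred - pos_true) ** 2))      # 33.5 (plain MSE)
```

In this example the plain MSE is dominated by the zero-filled targets of the missing point, whereas the masked version only measures the error on the labeled one.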