Question

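I want to apply triplet loss to a recognition problem. I do this by building triplets and storing them in a 5D array, X = [total_samples, triplet_marking, channels, height, width]. Ground-truth labels are not really needed here, because the triplet markings (0: anchor, 1: positive, 2: negative) are sufficient, but Keras requires labels, so I define them in a 2D array, Y = [total_samples, triplet_marking].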
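To make the layout concrete, here is a minimal NumPy sketch of the arrays described above. The sizes are small hypothetical values chosen for illustration (the real images are 3x384x384), and filling Y with the markings 0, 1, 2 is an assumption, since the loss ignores the labels anyway.

```python
import numpy as np

# Hypothetical small sizes for illustration only.
total_samples, channels, height, width = 4, 3, 8, 8

# X = [total_samples, triplet_marking, channels, height, width]
X = np.zeros((total_samples, 3, channels, height, width), dtype=np.float32)

# Y = [total_samples, triplet_marking]; placeholder labels (assumed content).
Y = np.tile(np.array([0, 1, 2]), (total_samples, 1))

# Slicing on the triplet_marking axis separates the three image sets.
anchors = X[:, 0]    # marking 0: anchor
positives = X[:, 1]  # marking 1: positive
negatives = X[:, 2]  # marking 2: negative

print(anchors.shape, Y.shape)
```

Each slice drops the triplet_marking axis, so every image set has shape (total_samples, channels, height, width).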

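First, I defined a base network, wrapped in a function, whose last layer is a 16-unit FC layer. Then I passed the anchor-positive-negative triplet as input. Then I defined the triplet loss as it is defined in Andrew Ng's Coursera course. Finally, I defined, compiled, and trained the model.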
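For reference, this is the triplet loss as I understand it, max(||a - p||^2 - ||a - n||^2 + margin, 0) per triplet, written as a plain NumPy sketch. The function name `triplet_loss_np` is hypothetical, and it assumes the embeddings arrive stacked as (batch_size, 3, embedding_dim).

```python
import numpy as np

def triplet_loss_np(embeddings, margin=1.0):
    """Per-triplet loss for embeddings of shape (batch_size, 3, embedding_dim)."""
    a = embeddings[:, 0]  # anchor embeddings
    p = embeddings[:, 1]  # positive embeddings
    n = embeddings[:, 2]  # negative embeddings

    # Squared Euclidean distances, summed over the embedding axis only.
    pos_dist = np.sum(np.square(a - p), axis=-1)
    neg_dist = np.sum(np.square(a - n), axis=-1)

    # Element-wise hinge: zero once the negative is farther than the
    # positive by at least the margin.
    return np.maximum(pos_dist - neg_dist + margin, 0.0)

# Two toy triplets with 2-dimensional embeddings.
toy = np.array([
    [[0.0, 0.0], [0.0, 0.0], [2.0, 0.0]],  # negative is far away
    [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]],  # negative coincides with anchor
])
losses = triplet_loss_np(toy)
print(losses)
```

Note the difference between `np.maximum` (element-wise) and `np.max` (a reduction over an axis); the hinge needs the element-wise one.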

from keras import layers, models

def network(input_shape):
    # Base embedding network shared by the anchor, positive, and negative inputs.
    seq = models.Sequential()
    seq.add(layers.Conv2D(8, (3,3), strides=(1,1), input_shape=input_shape, data_format="channels_first", activation="relu", kernel_initializer="glorot_uniform"))
    seq.add(layers.Conv2D(8, (3,3), strides=(1,1), data_format="channels_first", activation="relu", kernel_initializer="glorot_uniform"))
    seq.add(layers.MaxPooling2D(pool_size=(2, 2), data_format="channels_first"))
    seq.add(layers.Flatten())
    # 16-unit FC layer that produces the embedding.
    seq.add(layers.Dense(16, activation='relu'))
    seq.add(layers.BatchNormalization())
    return seq

img_anc = layers.Input(shape=(3,384,384))
img_pos = layers.Input(shape=(3,384,384))
img_neg = layers.Input(shape=(3,384,384))

base_network = network((3,384,384))
feature_anc = base_network(img_anc)
feature_pos = base_network(img_pos)
feature_neg = base_network(img_neg)

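from keras import backend as K

def triplet_loss(y_true, y_pred):
    # Written expecting y_pred of shape (batch_size, 3, 16); y_true is ignored.
    a = y_pred[:,0]
    p = y_pred[:,1]
    n = y_pred[:,2]

    margin = 1.0

    # Backend ops instead of NumPy, so the loss works on symbolic tensors.
    pos_dist = K.sum(K.square(a - p), axis=-1)
    neg_dist = K.sum(K.square(a - n), axis=-1)
    basic_loss = pos_dist - neg_dist + margin
    loss = K.maximum(basic_loss, 0.0)

    return loss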


model_train = models.Model(inputs=[img_anc, img_pos, img_neg], outputs=[feature_anc, feature_pos, feature_neg])

model_train.compile(loss=triplet_loss, optimizer='adam')

img_a = x_train[:, 0] #using triplet_marking
img_p = x_train[:, 1]
img_n = x_train[:, 2]

# [img_a, img_p, img_n] is my input
# [y_train[:,0], y_train[:,1], y_train[:,2]] are my ground-truth labels
# the loss function does not take the labels into account; I pass them only so that Keras works

history = model_train.fit([img_a, img_p, img_n], [y_train[:,0], y_train[:,1], y_train[:,2]], batch_size=16, epochs= 100, verbose=2, validation_split=.25, shuffle=True)

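Since I feed 3 images as input, I expected y_pred to have shape (batch_size, 3, 16), where y_pred[:,0] would be the anchor features, y_pred[:,1] the positive features, and y_pred[:,2] the negative features, but its shape is (batch_size, 16). Maybe that is why I am not getting correct results. Please correct me if I am doing something wrong anywhere.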
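A possible reading of the mismatch, offered as a sketch rather than a confirmed diagnosis: with three separate model outputs, Keras evaluates the loss once per output, so each call would see a single (batch_size, 16) tensor. The NumPy snippet below only illustrates the shapes involved. The random arrays stand in for the three embeddings, and stacking them along a new axis is an assumption about what a combined single output would have to look like.

```python
import numpy as np

batch_size, embedding_dim = 16, 16
rng = np.random.default_rng(0)

# Stand-ins for the three network outputs, each (batch_size, 16).
feature_anc = rng.normal(size=(batch_size, embedding_dim))
feature_pos = rng.normal(size=(batch_size, embedding_dim))
feature_neg = rng.normal(size=(batch_size, embedding_dim))

# What the loss expects: one array with a triplet axis at position 1.
stacked = np.stack([feature_anc, feature_pos, feature_neg], axis=1)
print(stacked.shape)  # (batch_size, 3, embedding_dim)

# What a single output looks like: indexing [:, 0] here selects the first
# embedding component of every sample, not the anchor embedding.
first_component = feature_anc[:, 0]
print(first_component.shape)  # (batch_size,)
```

On the stacked array, `stacked[:, 0]`, `stacked[:, 1]`, and `stacked[:, 2]` recover the anchor, positive, and negative embeddings respectively, which is the indexing the loss function relies on.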

Applying triplet loss to a recognition problem

0 answers: