# Imports needed by this snippet (img_size and lr are assumed to be defined earlier)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Input, Dense, Dropout, GlobalAveragePooling2D, Lambda
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# We have 2 inputs, 1 for each picture
left_input = Input(img_size)
right_input = Input(img_size)

# We will use 2 instances of 1 network for this task
convnet = MobileNetV2(weights='imagenet', include_top=False, input_shape=img_size, input_tensor=None)
convnet.trainable = True
x = convnet.output
x = GlobalAveragePooling2D()(x)
x = Dense(320, activation='relu')(x)
x = Dropout(0.2)(x)
preds = Dense(101, activation='sigmoid')(x)  # Apply sigmoid
convnet = Model(inputs=convnet.input, outputs=preds)

# Connect each 'leg' of the network to each input
# Remember, they share the same weights
encoded_l = convnet(left_input)
encoded_r = convnet(right_input)

# Get the L1 distance between the 2 encodings
L1_layer = Lambda(lambda tensors: K.abs(tensors[0] - tensors[1]))
# Add the distance function to the network
L1_distance = L1_layer([encoded_l, encoded_r])
prediction = Dense(1, activation='sigmoid')(L1_distance)

siamese_net = Model(inputs=[left_input, right_input], outputs=prediction)
optimizer = Adam(lr, decay=2.5e-4)
# TODO: get the layer-wise learning rates and momentum annealing scheme described in the paper working
siamese_net.compile(loss=keras.losses.binary_crossentropy, optimizer=optimizer, metrics=['accuracy'])
siamese_net.summary()
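For reference, the two branches are fed a pair of image batches together with a binary same/different label. A minimal sketch of the training call, assuming hypothetical arrays pairs_left, pairs_right (shape (N,) + img_size) and labels (shape (N,), 1 = same class, 0 = different class) produced by a pair generator that is not shown here:

# Sketch only: pairs_left, pairs_right and labels are hypothetical arrays
# built by a pair generator that is not part of the question.
history = siamese_net.fit(
    [pairs_left, pairs_right], labels,
    batch_size=32,          # arbitrary choice for the sketch
    epochs=10,
    validation_split=0.2,   # arbitrary choice for the sketch
)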
The training results are as follows:

Epoch 1/10
126/126 [==============================] - 169s 1s/step - loss: 0.5683 - accuracy: 0.6840 - val_loss: 0.4644 - val_accuracy: 0.8044
Epoch 2/10
126/126 [==============================] - 163s 1s/step - loss: 0.2032 - accuracy: 0.9995 - val_loss: 0.2117 - val_accuracy: 0.9681
Epoch 3/10
126/126 [==============================] - 163s 1s/step - loss: 0.1110 - accuracy: 0.9925 - val_loss: 0.1448 - val_accuracy: 0.9840
Epoch 4/10
126/126 [==============================] - 164s 1s/step - loss: 0.0844 - accuracy: 0.9950 - val_loss: 0.1384 - val_accuracy: 0.9820
Epoch 5/10
126/126 [==============================] - 163s 1s/step - loss: 0.0634 - accuracy: 0.9990 - val_loss: 0.0829 - val_accuracy: 1.0000
Epoch 6/10
126/126 [==============================] - 165s 1s/step - loss: 0.0526 - accuracy: 0.9995 - val_loss: 0.0729 - val_accuracy: 1.0000
Epoch 7/10
126/126 [==============================] - 164s 1s/step - loss: 0.0465 - accuracy: 0.9995 - val_loss: 0.0641 - val_accuracy: 1.0000
Epoch 8/10
126/126 [==============================] - 163s 1s/step - loss: 0.0463 - accuracy: 0.9985 - val_loss: 0.0595 - val_accuracy: 1.0000
When I compare two different images, the model predicts with high accuracy, and it also does very well on pairs of images from the same class. However, when I compare image1 with image1 itself, it predicts they are similar with a probability of only 0.5, whereas comparing image1 with image2 gives a correct prediction with a probability of about 0.8 (here image1 and image2 belong to the same class).
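To make the comparison concrete, here is a minimal sketch of the calls described above, assuming img1 and img2 are preprocessed arrays of shape img_size:

import numpy as np

# img1 and img2 are assumed to be preprocessed arrays of shape img_size;
# np.newaxis adds the batch dimension the model expects.
same_score = siamese_net.predict([img1[np.newaxis], img1[np.newaxis]])   # image1 vs. itself -> ~0.5
cross_score = siamese_net.predict([img1[np.newaxis], img2[np.newaxis]])  # image1 vs. image2 (same class) -> ~0.8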
I would expect comparing an image with itself to give a confidently correct prediction. I have tried a few alternatives, but nothing has worked out. Could someone tell me what the mistake might be?