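I am trying to modify the code from the Keras example about siamese network. Strangely, although the loss decreases, the accuracy stays at 0.5000. My current hypothesis is that I modified the create_pairs function incorrectly when I tried to change the number of classes to 4:
Original: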
def create_pairs(x, digit_indices):
    '''Positive and negative pair creation.
    Alternates between positive and negative pairs.
    '''
    pairs = []
    labels = []
    n = min([len(digit_indices[d]) for d in range(10)]) - 1
    for d in range(10):
        for i in range(n):
            z1, z2 = digit_indices[d][i], digit_indices[d][i + 1]
            pairs += [[x[z1], x[z2]]]
            inc = random.randrange(1, 10)
            dn = (d + inc) % 10
            z1, z2 = digit_indices[d][i], digit_indices[dn][i]
            pairs += [[x[z1], x[z2]]]
            labels += [1, 0]
    return np.array(pairs), np.array(labels)
and lines 93-97:
digit_indices = [np.where(y_train == i)[0] for i in range(10)]
tr_pairs, tr_y = create_pairs(x_train, digit_indices)
digit_indices = [np.where(y_test == i)[0] for i in range(10)]
te_pairs, te_y = create_pairs(x_test, digit_indices)
Here is my code:
def create_pairs(x, digit_indices):
    '''Positive and negative pair creation.
    Alternates between positive and negative pairs.
    '''
    pairs = []
    labels = []
    n = min([len(digit_indices[d]) for d in range(4)]) - 1
    for d in range(4):
        for i in range(n):
            z1, z2 = digit_indices[d][i], digit_indices[d][i + 1]
            pairs += [[x[z1], x[z2]]]
            inc = random.randrange(1, 4)
            dn = (d + inc) % 4
            z1, z2 = digit_indices[d][i], digit_indices[dn][i]
            pairs += [[x[z1], x[z2]]]
            labels += [1, 0]
    return np.array(pairs), np.array(labels)
and lines 93-97:
digit_indices = [np.where(y_train == i)[0] for i in range(4)]
tr_pairs, tr_y = create_pairs(x_train, digit_indices)
digit_indices = [np.where(y_test == i)[0] for i in range(4)]
te_pairs, te_y = create_pairs(x_test, digit_indices)
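As a sanity check on the pair creation itself, the class count can be passed in as a parameter and the output inspected on toy data. This is only a minimal sketch: `create_pairs_n`, `num_classes`, and the toy arrays are illustrative names and data, not part of the Keras example.

```python
import random
import numpy as np

def create_pairs_n(x, class_indices, num_classes):
    """Alternating positive/negative pairs for an arbitrary class count."""
    pairs, labels = [], []
    # The smallest class bounds the loop so that index i + 1 stays valid.
    n = min(len(class_indices[d]) for d in range(num_classes)) - 1
    for d in range(num_classes):
        for i in range(n):
            # Positive pair: two samples from the same class.
            z1, z2 = class_indices[d][i], class_indices[d][i + 1]
            pairs.append([x[z1], x[z2]])
            # Negative pair: inc is never 0, so dn always differs from d.
            inc = random.randrange(1, num_classes)
            dn = (d + inc) % num_classes
            z1, z2 = class_indices[d][i], class_indices[dn][i]
            pairs.append([x[z1], x[z2]])
            labels += [1, 0]
    return np.array(pairs), np.array(labels)

# Toy data (assumption): 4 classes, 5 samples each, feature equals class id.
y = np.repeat(np.arange(4), 5)
x = y.reshape(-1, 1).astype("float32")
class_indices = [np.where(y == i)[0] for i in range(4)]
pairs, labels = create_pairs_n(x, class_indices, 4)
```

On this toy data every pair labelled 1 holds two identical features and every pair labelled 0 holds two different ones, so the 4-class change to the pairing logic looks consistent on its own.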
This is my base_network (the one using an RNN, not the convnet I mentioned in my comment reply; both give the same result, 50% accuracy):
def create_base_network(embedding_layer):
    seq = Sequential()
    seq.add(embedding_layer)
    seq.add(GRU(512, use_bias=True, dropout=0.5, recurrent_dropout=0.5, return_sequences=True))
    seq.add(GRU(512, use_bias=True, dropout=0.5, recurrent_dropout=0.5))
    seq.add(Dense(512, activation='relu'))
    seq.add(Dropout(0.1))
    seq.add(Dense(512, activation='relu'))
    return seq
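The embedding layer is just a simple GloVe matrix. I also add another dense layer with a sigmoid activation after the merge.
What am I missing? Or is that not how I should change it? Thanks in advance.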
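Answer 0 (score: 0)
The siamese example code is wrong and has not been fixed yet. The problem is that the loss function is not symmetric under swapping 0 and 1, but the Keras code assumes it is. Change this line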
return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
to
return K.mean((1 - y_true) * K.square(y_pred) + y_true * K.square(K.maximum(margin - y_pred, 0)))
and
labels += [1, 0]
to
labels += [0, 1]
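To illustrate the asymmetry, here is a rough NumPy sketch of the two loss forms. It assumes `y_pred` is a distance `d` and uses `np` in place of the Keras backend `K`; the function names `loss_original` and `loss_swapped` are made up for this illustration.

```python
import numpy as np

def loss_original(y_true, d, margin=1.0):
    # First form above: label 1 penalizes large distances.
    return np.mean(y_true * d**2 + (1 - y_true) * np.maximum(margin - d, 0)**2)

def loss_swapped(y_true, d, margin=1.0):
    # Second form above: label 0 penalizes large distances.
    return np.mean((1 - y_true) * d**2 + y_true * np.maximum(margin - d, 0)**2)

d = np.array([0.1, 0.9])   # one close pair, one distant pair
y = np.array([1.0, 0.0])   # labels in the [1, 0] convention

same_labels = (loss_original(y, d), loss_swapped(y, d))
flipped_labels = loss_swapped(1 - y, d)
```

With the same labels the two forms give very different values, while the swapped form fed the flipped labels matches the original form. So the loss line and the `labels` line have to be changed together; changing only one of them inverts the training target.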