Keras: different precision and recall for shuffled vs. unshuffled test data

Posted: 2018-08-04 07:55:52

Tags: python tensorflow machine-learning keras

I am building a model in Keras that does binary classification based on some values. In my data the positive and negative examples are kept separate, and for training I shuffle them together. On the test set I assumed I would not need to shuffle, since the order should make no difference. But while the accuracy of the model on the unshuffled test data does drop a little, its recall and precision stay very low; when I shuffle the test data instead, the accuracy stays about the same, but recall and precision reach much higher values. I have put the accuracy and recall plots below, so check them out.

So my first question is: why is there a difference between the precision and recall values on shuffled and unshuffled data?

Second question: which scores should I trust, or should I be measuring recall and precision in a different way?
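A likely explanation is that Keras evaluates custom metrics like these per batch and then averages the batch values. On the unshuffled test set the first batches are all negatives, so those batches contribute a recall of 0/(0+epsilon) = 0 to the average, dragging it down; shuffling mixes positives into every batch and the averaged value changes. (This batch-wise averaging is also why Keras 2.x removed `precision`/`recall` as built-in metrics.) The following sketch, using hypothetical label and prediction arrays, reproduces the effect with plain NumPy:

```python
import numpy as np

EPS = 1e-7

def batch_recall(y_true, y_pred):
    # Mirrors the custom Keras metric: TP / (possible positives + epsilon)
    tp = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    pp = np.sum(np.round(np.clip(y_true, 0, 1)))
    return tp / (pp + EPS)

def mean_over_batches(y_true, y_pred, batch_size=64):
    # Average the per-batch metric, as Keras does during evaluate()
    scores = [batch_recall(y_true[i:i + batch_size], y_pred[i:i + batch_size])
              for i in range(0, len(y_true), batch_size)]
    return float(np.mean(scores))

rng = np.random.default_rng(0)
# 2000 negatives followed by 1000 positives, like the unshuffled test set
y_true = np.concatenate([np.zeros(2000), np.ones(1000)])
# a hypothetical classifier that labels 90% of all samples correctly
y_pred = np.where(rng.random(3000) < 0.9, y_true, 1 - y_true)

print(mean_over_batches(y_true, y_pred))  # unshuffled: low, many all-negative batches score 0
perm = rng.permutation(3000)
print(mean_over_batches(y_true[perm], y_pred[perm]))  # shuffled: close to the true recall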

The code is below:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras import backend as K

top = 34000
toptop = 35000

x_neg = full_x_neg[:43000]
y_neg = np.zeros(len(x_neg))
x_pos = full_x_pos[:34000]
y_pos = np.ones(len(x_pos))

x_test = np.asarray(full_x_neg[43000:45000] + full_x_pos[top:toptop])
y_test = np.asarray(np.concatenate((np.zeros(len(full_x_neg[43000:45000])), np.ones(len(full_x_pos[top:toptop])))))
x_test = x_test.reshape((len(x_test), 10, 12))
x_test, y_test = unison_shuffled_copies(x_test, y_test)  # shuffling the test set

x = np.asarray(x_neg + x_pos)       # x and y were not defined in the original snippet;
y = np.concatenate((y_neg, y_pos))  # this construction is assumed
x, y = unison_shuffled_copies(x, y)
x = x.reshape((len(x), 10, 12))
batch_size = 64
print('Build model...')
model = Sequential()
model.add(LSTM(128, dropout=0.2, input_shape=(10, 12)))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=[precision, recall, fscore])  # fscore is a custom metric (definition not shown)
print('Training...')
history = model.fit(x, y, validation_data=(x_test, y_test),
                    batch_size=batch_size,
                    epochs=15)
score, prec, rec, fscore = model.evaluate(x_test, y_test, batch_size=batch_size)
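An order-independent way to score the model is to compute precision and recall once over the whole test set, from the predictions, instead of averaging per-batch metric values. A minimal sketch, with hypothetical stand-ins for `y_test` and the output of `model.predict(x_test)` (in the real code you would threshold the predicted probabilities at 0.5):

```python
import numpy as np

# Hypothetical stand-ins: 2000 negatives then 1000 positives,
# with 100 errors of each kind baked into the predictions.
y_test = np.concatenate([np.zeros(2000), np.ones(1000)])
y_hat = y_test.copy()
y_hat[:100] = 1       # 100 false positives
y_hat[2000:2100] = 0  # 100 false negatives

tp = int(np.sum((y_hat == 1) & (y_test == 1)))
fp = int(np.sum((y_hat == 1) & (y_test == 0)))
fn = int(np.sum((y_hat == 0) & (y_test == 1)))

precision = tp / (tp + fp)  # computed once over the whole set
recall = tp / (tp + fn)     # so sample order cannot matter
print(precision, recall)    # 0.9 0.9
```

`sklearn.metrics.precision_score` and `recall_score` do the same computation and are a common drop-in for this.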

The recall function:

def recall(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

The precision function:

def precision(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision
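Note that both metrics are degenerate on a batch that contains no positives at all: the numerator is 0 and the epsilon in the denominator makes the result exactly 0, even when every sample in the batch is classified correctly. A small NumPy sketch of that edge case:

```python
import numpy as np

EPS = 1e-7

# An all-negative batch, classified perfectly
y_true = np.zeros(64)
y_pred = np.zeros(64)

tp = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
recall = tp / (np.sum(np.round(np.clip(y_true, 0, 1))) + EPS)     # 0 / epsilon
precision = tp / (np.sum(np.round(np.clip(y_pred, 0, 1))) + EPS)  # 0 / epsilon
print(recall, precision)  # 0.0 0.0 despite a perfect batch
```

On the unshuffled test set, every batch drawn entirely from the negative block behaves like this, which is what pulls the averaged recall and precision down.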

Recall with shuffled test data

Precision with shuffled test data

Recall without shuffling test data

Precision without shuffling test data

0 Answers:

There are no answers yet.