The same model consistently produces different accuracies in Keras and TensorFlow

Asked: 2018-12-20 19:15:15

Tags: python tensorflow keras neural-network

I am trying to implement the same model twice on custom data: once in Keras and once in TensorFlow using Keras layers. Over many training runs, the two versions consistently produce different accuracies (Keras ~71%, TensorFlow ~65%). I want the TensorFlow version to do as well as the Keras one, so that I can then iterate in TensorFlow to tune some lower-level algorithms.

Here is my original Keras code:

import keras  # needed below for keras.layers.concatenate, keras.losses, keras.optimizers
from keras.layers import Dense, Dropout, Input
from keras.models import Model, Sequential
from keras import backend as K

input_size = 2000
num_classes = 4
num_industries = 22
num_aux_inputs = 3

main_input = Input(shape=(input_size,),name='text_vectors')
x = Dense(units=64, activation='relu', name = 'dense1')(main_input)
drop1 = Dropout(0.2,name='dropout1')(x)

auxiliary_input = Input(shape=(num_aux_inputs,), name='aux_input')
x = keras.layers.concatenate([drop1,auxiliary_input])
x = Dense(units=64, activation='relu',name='dense2')(x)
drop2 = Dropout(0.1,name='dropout2')(x)

x = Dense(units=32, activation='relu',name='dense3')(drop2)

main_output = Dense(units=num_classes,
                    activation='softmax', name='main_output')(x)

model = Model(inputs=[main_input, auxiliary_input],
              outputs=main_output)

model.compile(loss=keras.losses.categorical_crossentropy,
              metrics=['accuracy'],
              optimizer=keras.optimizers.Adadelta())

history = model.fit([train_x,train_x_auxiliary], train_y, batch_size=128, epochs=20, verbose=1, validation_data=([val_x,val_x_auxiliary], val_y))
loss, accuracy = model.evaluate([val_x,val_x_auxiliary], val_y, verbose=0)

Here is where I moved the Keras layers onto TensorFlow, following this article:

import tensorflow as tf
from keras import backend as K
import keras
from keras.layers import Dense, Dropout, Input # Dense layers are "fully connected" layers
from keras.metrics import categorical_accuracy as accuracy
from keras.losses import categorical_crossentropy  # keras.objectives was removed in Keras 2


tf.reset_default_graph()

sess = tf.Session()
K.set_session(sess)

input_size = 2000
num_classes = 4
num_industries = 22
num_aux_inputs = 3

x = tf.placeholder(tf.float32, shape=[None, input_size], name='X')
x_aux = tf.placeholder(tf.float32, shape=[None, num_aux_inputs], name='X_aux')
y = tf.placeholder(tf.float32, shape=[None, num_classes], name='Y')

# build graph
layer = Dense(units=64, activation='relu', name = 'dense1')(x)
drop1 = Dropout(0.2,name='dropout1')(layer)
layer = keras.layers.concatenate([drop1,x_aux])
layer = Dense(units=64, activation='relu',name='dense2')(layer)
drop2 = Dropout(0.1,name='dropout2')(layer)
layer = Dense(units=32, activation='relu',name='dense3')(drop2)
output_logits = Dense(units=num_classes, activation='softmax',name='main_output')(layer)

loss = tf.reduce_mean(categorical_crossentropy(y, output_logits))
acc_value = tf.reduce_mean(accuracy(y, output_logits))

correct_prediction = tf.equal(tf.argmax(output_logits, 1), tf.argmax(y, 1), name='correct_pred')

optimizer = tf.train.AdadeltaOptimizer(learning_rate=1.0, rho=0.95,epsilon=tf.keras.backend.epsilon()).minimize(loss)

init = tf.global_variables_initializer()

sess.run(init)

epochs = 20             # Total number of training epochs
batch_size = 128        # Training batch size
display_freq = 300      # Frequency of displaying the training results
num_tr_iter = int(len(y_train) / batch_size)

with sess.as_default():

    for epoch in range(epochs):
        print('Training epoch: {}'.format(epoch + 1))
        # Randomly shuffle the training data at the beginning of each epoch 
        x_train, x_train_aux, y_train = randomize(x_train, x_train_auxiliary, y_train)

        for iteration in range(num_tr_iter):
            start = iteration * batch_size
            end = (iteration + 1) * batch_size
            x_batch, x_aux_batch, y_batch = get_next_batch(x_train, x_train_aux, y_train, start, end)

            # Run optimization op (backprop)
            feed_dict_batch = {x: x_batch, x_aux: x_aux_batch, y: y_batch, K.learning_phase(): 1}

            optimizer.run(feed_dict=feed_dict_batch)
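One place the two setups can silently diverge: the loop above only trains, and if accuracy is later computed with `K.learning_phase(): 1` still in the feed dict, dropout stays active at evaluation time. A minimal numpy sketch (hypothetical constant activations, not the question's data) of the inverted dropout Keras applies in the training phase:

```python
import numpy as np

rng = np.random.RandomState(0)
acts = np.ones(10000)          # hypothetical stand-in activations
rate = 0.2                     # same rate as the question's dropout1

# Inverted dropout, as Keras applies it in the training phase:
mask = rng.binomial(1, 1 - rate, size=acts.shape)
train_phase = acts * mask / (1 - rate)

# The expected value is preserved, but individual units are zeroed, so
# evaluating with learning_phase=1 injects noise into every prediction.
print(train_phase.mean())          # close to 1.0
print((train_phase == 0).mean())   # close to the dropout rate, 0.2
```

Feeding `K.learning_phase(): 0` at evaluation time disables the masking entirely, which is what `model.evaluate` does internally in the Keras version.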

I also implemented the whole model from scratch in tensorflow, but it also lands at ~65% accuracy, so I decided to try this Keras-layers-on-TF setup to isolate the problem.

I looked up posts about similar Keras-vs-TensorFlow discrepancies and tried the following, none of which helped in my case:

  1. Keras's Dropout layers are only active during the training phase, so I did the same thing in the tf code by feeding keras.backend.learning_phase().

  2. Keras and TensorFlow have different default variable initializations. I tried initializing my weights in tensorflow in the following three ways, which should be identical to Keras's initialization, but they did not affect the accuracy either:

    initer = tf.glorot_uniform_initializer() 
    initer = tf.contrib.layers.xavier_initializer() 
    initer = tf.random_normal(shape) * (np.sqrt(2.0/(shape[0] + shape[1])))
    
  3. The optimizer settings are exactly the same in both versions! Although the accuracy does not seem to depend on the optimizer: I tried different optimizers in both keras and tf, and in each case the accuracies converged to the same values.
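For reference, the first two initializers in item 2 both implement the Glorot/Xavier uniform rule, which is also the default kernel initializer of Keras's `Dense` layer; a numpy sketch of that rule, with shapes taken from the question's `dense1` layer:

```python
import numpy as np

def glorot_uniform(shape, rng):
    # Glorot/Xavier uniform: U(-limit, limit) with
    # limit = sqrt(6 / (fan_in + fan_out)), which is what both
    # tf.glorot_uniform_initializer and Keras's default Dense
    # kernel initializer implement.
    fan_in, fan_out = shape
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape)

# 2000 inputs, 64 units, as in dense1:
w = glorot_uniform((2000, 64), np.random.RandomState(0))
```

The third snippet in item 2 is instead the Glorot *normal* variant (a normal distribution with stddev sqrt(2 / (fan_in + fan_out))), so it draws from a different distribution than the other two, even though the scale is related.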

Help!

2 Answers:

Answer 0 (score: 0)

This looks to me most likely like a weight-initialization problem. What I would suggest is to initialize the Keras layers, grab the layer weights before training, and initialize the tf layers with those values.

I ran into a problem like this and doing that solved it for me, but that was a long time ago and I do not know whether these initializers have since been made identical. At the time, the tf and keras initializations were clearly not the same.
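A minimal sketch of that suggestion, assuming the shapes of the question's `dense1` layer: draw the initial weights once, then push the same arrays into both graphs before training. The calls in the comments (`set_weights`, `tf.assign`) are the standard APIs for this; `dense1_layer`, `kernel_var`, and `bias_var` are hypothetical handles to the corresponding layer and variables.

```python
import numpy as np

# Draw one shared set of initial weights (Glorot-uniform scale for a
# 2000-in, 64-out kernel, matching dense1):
rng = np.random.RandomState(0)
limit = np.sqrt(6.0 / (2000 + 64))
shared_kernel = rng.uniform(-limit, limit, size=(2000, 64))
shared_bias = np.zeros(64)

# In Keras, before training:
#   dense1_layer.set_weights([shared_kernel, shared_bias])
# In TF1, before training:
#   sess.run([tf.assign(kernel_var, shared_kernel),
#             tf.assign(bias_var, shared_bias)])
```

With identical starting weights, any remaining accuracy gap has to come from the data pipeline or the training loop rather than from initialization.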

Answer 1 (score: -1)

I checked the initializers, the seeds, the parameters and the hyperparameters, but the accuracies still differed.

I checked Keras's code: it randomly shuffles the training examples before feeding each batch to the network, so this shuffling differs between the two engines. We therefore need a way to feed the same sequence of batches to both networks in order to get the same accuracy.
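One way to do that is to drive the shuffle from a single seeded permutation shared by both pipelines; a numpy sketch of a seeded variant of the `randomize` helper the question's training loop calls (the tiny arrays are hypothetical, just to show the rows stay aligned):

```python
import numpy as np

def randomize(x, x_aux, y, seed):
    # One shared, seeded permutation keeps all three arrays aligned and
    # makes the batch order reproducible across engines and runs.
    perm = np.random.RandomState(seed).permutation(len(y))
    return x[perm], x_aux[perm], y[perm]

x = np.arange(6).reshape(3, 2)      # rows [0,1], [2,3], [4,5]
x_aux = np.arange(3).reshape(3, 1)  # rows [0], [1], [2]
y = np.array([0, 1, 2])
x_s, x_aux_s, y_s = randomize(x, x_aux, y, seed=0)
```

Passing the same `seed` per epoch to both the Keras and the TF pipeline makes them consume the batches in the same order, removing one source of run-to-run divergence.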