Deep learning with TensorFlow: problem saving and loading a model

Asked: 2018-06-05 10:12:32

Tags: python tensorflow machine-learning keras deep-learning

Context
I am building a model for image recognition with TensorFlow. I am trying to save my model and then restore it to make predictions.

Procedure
I built a CNN with the following structure:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_3 (Conv2D)            (None, 204, 204, 32)      896       
_________________________________________________________________
dropout_3 (Dropout)          (None, 204, 204, 32)      0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 204, 204, 32)      9248      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 102, 102, 32)      0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 332928)            0         
_________________________________________________________________
dense_3 (Dense)              (None, 512)               170459648 
_________________________________________________________________
dropout_4 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 39)                20007     
_________________________________________________________________
activation_1 (Activation)    (None, 39)                0         
=================================================================
Total params: 170,489,799
Trainable params: 170,489,799
Non-trainable params: 0   

I saved the graph and the variables with a saver. I deliberately named my placeholders X and Y, and named the prediction (which is an operation) "output". The purpose of naming them is to be able to restore them later and use them for prediction.
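The naming idea described above can be sketched minimally as follows (this is not the post's full model; the shapes and the dummy `matmul` layer are illustrative, only the tensor names match the post):

```python
import tensorflow.compat.v1 as tf  # assumes TF 2.x with the v1 compat API
tf.disable_eager_execution()

tf.reset_default_graph()
# Explicitly named placeholders, as in the post
X = tf.placeholder(tf.float32, [None, 4], name="X")
Y = tf.placeholder(tf.float32, [None, 2], name="Y")
W = tf.Variable(tf.ones([4, 2]))
logits = tf.matmul(X, W)
# tf.identity (like the post's tf.multiply(Z, 1, ...)) pins a stable,
# known name onto the prediction tensor
output = tf.identity(logits, name="the_output_for_prediction")

# These graph-level names are what get_tensor_by_name expects after a restore:
print(X.name)       # "X:0"
print(output.name)  # "the_output_for_prediction:0"
```

Note that a tensor's name is the op name plus an output index (`:0`), which is exactly the detail the accepted answer hinges on.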

What I did

  1. To build and train my model, I first defined my placeholders and constructed the computation graph.
  2. Then, I defined a saver for the graph and the variables.
  3. After that, I trained my network (I used 1 epoch because I only wanted to check that saving and restoring work correctly). I then saved the session's variable values with the saver and exported them to a folder so that I could restore them later. Before closing the TensorFlow session, I computed the prediction at the end of the first epoch, so that I could compare it with the result from the restored session.
  4. I restored/reloaded the graph & variables (which I saved in the previous step).
  5. I restored the saved placeholders (X, Y) and the operation "output" in order to compute the prediction, using the same data as at the end of the first training epoch. The goal was to make sure that the save and restore process works correctly. But when I computed the prediction, I got a None value. I don't understand what is wrong with my approach.
  6. Below is the code I used.

    Define & design the model structure

    def create_model(X,Y):
        # Convolutional Layer #1
        conv1 = tf.layers.conv2d(
          inputs=X,
          filters=32,
          kernel_size=[3, 3],
          padding="same",
          activation=tf.nn.relu)
        print('conv1 OUTPUT shape: ',conv1.shape)
    
        # Dropout layer #1
        dropout1 = tf.layers.dropout(
          inputs=conv1, rate=0.2, training='TRAIN' == tf.estimator.ModeKeys.TRAIN)  # note: ModeKeys.TRAIN is the lowercase string 'train', so this comparison is always False
        print('drop1 OUTPUT shape: ',dropout1.shape)
    
        # Convolutional Layer #2
        conv2 = tf.layers.conv2d(
          inputs=dropout1,
          filters=32,
          kernel_size=[3, 3],
          padding="same",
          activation=tf.nn.relu)
        print('conv2 OUTPUT shape: ',conv2.shape)
    
        # Pooling Layer #2
        pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2],strides=2)
        print('pool2 OUTPUT shape: ',pool2.shape)
        pool2_flat = tf.reshape(pool2, [-1, pool2.shape[1]*pool2.shape[2]*pool2.shape[3]])
    
        # Dense layer #3
        dense3 = tf.layers.dense(inputs=pool2_flat, units=512, activation=tf.nn.relu)
        print('dense3 OUTPUT shape: ',dense3.shape)
    
        # Dropout layer #3
        dropout3 = tf.layers.dropout(
          inputs=dense3, rate=0.5, training='TRAIN' == tf.estimator.ModeKeys.TRAIN)  # note: ModeKeys.TRAIN is the lowercase string 'train', so this comparison is always False
        print('drop3 OUTPUT shape: ',dropout3.shape)
    
        # Dense layer #4
        Z = tf.layers.dense(inputs=dropout3, units=39, activation=tf.nn.sigmoid)
        print('dense4 OUTPUT shape: ',Z.shape)
    
        #Threshold
        #Z1 = tf.to_int32(Z > 0.5)
        #print('OUTPUT: shape output after threshold ',Z1.shape)
    
        # Calculating cost
        cost= tf.reduce_mean(Y * - tf.log(tf.clip_by_value(Z,1e-10,1.0)) + (1 - Y) * - tf.log(tf.clip_by_value(1 - Z,1e-10,1.0)),axis=0)
        print('cost: shape of cost: ',cost.shape)
        cost= tf.reshape(cost, [1, 39])
        print('cost reshaped: shape of cost reshaped: ',cost.shape)
        #Optimizer
        optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
        #Metric paramaters 
        predicted_positive = predicted_positives(Y,Z)
        true_and_possible_positive = true_and_possible_positives(Y,Z)
        #naming operations
        output1=tf.multiply(Z,1,name='the_output_for_prediction')
        cost1=tf.multiply(cost,1,name='cost')
        return output1,predicted_positive,true_and_possible_positive,optimizer,cost1
    

    Create the model, train it, save it, and evaluate the prediction after the first epoch (works fine)

    # For a given (X_train,Y_train) & (X_valid,Y_valid) and some hyperparams (some cannot be changed): trains a model, saves it, and returns the metric values.
    
    #def train_model(X_train,Y_train,X_valid,Y_valid,num_epochs,minibatch_size=32):
    #Reset tensorflow graph
    ops.reset_default_graph()
    
    # hyperparameters
    num_epochs=1
    minibatch_size=32
    
    m=X_train.shape[0]
    
    #Initialize metric variables
    costs = []
    precision=[]
    recall=[]
    costs_v = []
    precision_v=[]
    recall_v=[]
    
    m=X_train.shape[0]
    
    
    
    #Placeholders
    X = tf.placeholder(tf.float32, [None, X_train.shape[1],X_train.shape[2],X_train.shape[3]],name="X")
    Y = tf.placeholder(tf.float32, [None, Y_train.shape[1]],name="Y")
    
    #Graph construction
    output,predicted_positives,true_and_possible_positives,optimizer,cost=create_model(X,Y)
    
    # Initializer for all the variables globally
    init = tf.global_variables_initializer()
    
    #Creating tf session and running it (i.e initialize the global variables) 
    sess=tf.Session()
    sess.run(init)
    
    # Create model saver
    #model_saver = tf.train.Saver()
    saver = tf.train.Saver()
    
    
    
    
    #Training algorithm
    #Running epochs
    for epoch in range(num_epochs):
    
                ########################################### training   ################################################
                print('training epoch number :',epoch)
    
                #initialize parameters of metrics 
                predicted_po=0
                true_po=0
                possible_po=0
    
                #initialize cost param and mini_batches
                minibatch_cost = 0.
                num_minibatches = int(m / minibatch_size) # number of minibatches of size minibatch_size in the train set
                minibatches = random_mini_batches(X_train, Y_train, minibatch_size)
                #Running batches
                for minibatch in minibatches:
    
                    #minibatch extraction
                    minibatch_X=X_train[minibatch,]
                    minibatch_Y=Y_train[minibatch,]
    
                    # IMPORTANT: The line that runs the graph on a minibatch.
                    # Run the session to execute the optimizer, cost and metric parameters , the feedict should contain a minibatch for (X,Y).
                    _ , temp_cost = sess.run([optimizer, cost], feed_dict={X:minibatch_X, Y:minibatch_Y})
                    predicted_p,true_p_p = sess.run([predicted_positives, true_and_possible_positives], feed_dict={X:minibatch_X, Y:minibatch_Y})
    
                    #calculate cost,predicted_true_possible positives per batch.
                    minibatch_cost += temp_cost / num_minibatches
                    predicted_po+=predicted_p
                    true_po+=true_p_p[0]
                    possible_po+=true_p_p[1]
    
                    #print batch param metric
                        #print('predicted_p per batch ',noi,'  ',predicted_p)
                        #print('true_p per bartch ',noi,'  ',true_p_p[0])
                        #print('possible_p per batch ',noi,'  ',true_p_p[1])
    
    
                '''
                # Print the cost every epoch
                #    print ("Cost after epoch %i: %f" % (epoch, minibatch_cost))
                print('precision_on_training epoch: ',epoch,'  ',true_po/predicted_po)
                print('recall on training epoch ',epoch,'  ', true_po/possible_po)
                '''
    
    
                #appending tables of costs and metrics on training
                costs.append(minibatch_cost)
                precision.append(true_po/predicted_po)
                recall.append(true_po/possible_po)
    
                minibatches=None
                minibatch_X=None
                minibatch_Y=None
                ###################################### Validation  ###################################################
    
                print('validation epoch number :',epoch)
    
                #create minibatches for validation
                #initialize parameters of metrics 
    
                m_v=X_valid.shape[0]
                minibatch_cost_v = 0.
                num_minibatches_v = int(m_v / minibatch_size) # number of minibatches validation of size minibatch_size_v in the validation set
                predicted_po_v=0
                true_po_v=0
                possible_po_v=0
                minibatches_valid=random_mini_batches(X_valid, Y_valid, minibatch_size)
    
    
    
                #running batches
                for minibatch in minibatches_valid:
                    #print('batch validation number ', noi_v, ' / ',num_minibatches_v)
                    # bug fix: draw validation batches from the validation set, not the training set
                    minibatch_X_valid=X_valid[minibatch,]
                    minibatch_Y_valid=Y_valid[minibatch,]
                    temp_cost_v = sess.run(cost, feed_dict={X:minibatch_X_valid, Y:minibatch_Y_valid})
                    predicted_p_v,true_p_p_v = sess.run([predicted_positives, true_and_possible_positives], feed_dict={X:minibatch_X_valid, Y:minibatch_Y_valid})
    
                    minibatch_cost_v += temp_cost_v / num_minibatches_v
                    predicted_po_v+=predicted_p_v
                    true_po_v+=true_p_p_v[0]
                    possible_po_v+=true_p_p_v[1]
    
                '''
                #Printing precision and recall on validation
                print('precision_on_validation epoch: ',epoch,'  ',true_po_v/predicted_po_v)
                print('recall on validation epoch ',epoch,'  ', true_po_v/possible_po_v)    
                '''
    
                #appending tables of costs and metrics on training
                costs_v.append(minibatch_cost_v)
                precision_v.append(true_po_v/predicted_po_v)
                recall_v.append(true_po_v/possible_po_v)    
                minibatches_valid=None   
                minibatch_X_valid=None
                minibatch_Y_valid=None
    
    
    #A simple test to check whether a restored model has exactly the same values (w and graph) as the original. Testing on a new output.
    prediction_first_epoch=sess.run([output], feed_dict={X:X_train[[0]], Y:Y_train[[0]]})
    print(np.mean(prediction_first_epoch))
    
    
    #Save model
    #global_step is appended to the checkpoint name: with path 'ok/' and
    #global_step=1000 this writes files named 'ok/-1000.*'
    saver.save(sess, 'ok/',global_step=1000)
    
    
    ##Save different training parameters.
    costs_array=np.asarray(costs)
    precision_array=np.asarray(precision)
    recall_array=np.asarray(recall)
    costs_v_array=np.asarray(costs_v)
    precision_v_array=np.asarray(precision_v)
    recall_v_array=np.asarray(recall_v)
    np.save('costs_array', costs_array)
    np.save('precision_array', precision_array)
    np.save('recall_array', recall_array)
    np.save('costs_v_array', costs_v_array)
    np.save('precision_v_array', precision_v_array)
    np.save('recall_v_array', recall_v_array)
    
    
    
    
    
    sess.close()
    

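The save/restore round trip above can be sketched minimally before looking at the restore code (the checkpoint directory, variable name, and toy values here are illustrative, not the post's actual ones; only the `Saver`/`import_meta_graph` pattern matches):

```python
import os
import tempfile
import numpy as np
import tensorflow.compat.v1 as tf  # assumes TF 2.x with the v1 compat API
tf.disable_eager_execution()

ckpt_dir = tempfile.mkdtemp()

# --- build, initialize, and save ---
tf.reset_default_graph()
v = tf.get_variable("v", initializer=tf.constant([1.0, 2.0, 3.0]))
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    original = sess.run(v)
    # global_step=1000 yields files named model-1000.*
    saver.save(sess, os.path.join(ckpt_dir, "model"), global_step=1000)

# --- restore into a fresh graph and compare values ---
tf.reset_default_graph()
with tf.Session() as sess:
    restorer = tf.train.import_meta_graph(os.path.join(ckpt_dir, "model-1000.meta"))
    restorer.restore(sess, tf.train.latest_checkpoint(ckpt_dir))
    # import_meta_graph also restores the variables collection
    restored = sess.run(tf.global_variables()[0])

assert np.allclose(original, restored)
```

If the round trip works, the restored values are bit-for-bit the saved ones, so any discrepancy at prediction time points at how the graph is queried, not at the checkpoint.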
    Restore the saved model and try to predict with the same data as before

    ops.reset_default_graph()
    sess=tf.Session()   
    #Load meta_graph and restore weights
    saver = tf.train.import_meta_graph('ok/-1000.meta')
    saver.restore(sess,tf.train.latest_checkpoint('ok/'))
    #put graph in variable and load adequate tensors & operations
    graph = tf.get_default_graph()
    X = graph.get_tensor_by_name("X:0")
    Y = graph.get_tensor_by_name("Y:0")
    output=graph.get_operation_by_name("the_output_for_prediction")
    output_vis= sess.run(output, feed_dict={X:X_train[[0]], Y:Y_train[[0]]})
    
    print(output_vis)
    sess.close()
    

    Result:

    INFO:tensorflow:Restoring parameters from ok/-1000
    None
    

    Question

    I don't understand why I get a None value, even though I restored the saved graph into the default graph and extracted the two placeholders and the prediction operation.

1 answer:

Answer 0 (score: 0)

[Problem solved]
I was calling the operation, which is why I got a None value. Instead, I need to call the tensor associated with that operation and feed it the input. So instead of doing this:

output=graph.get_operation_by_name("the_output_for_prediction")

I should do this:

output=graph.get_tensor_by_name("the_output_for_prediction:0")

PS: the ":0" is essential. If you just write "the_output_for_prediction", it will not work!
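The distinction can be demonstrated in isolation (a toy `multiply` graph here, not the post's CNN; only the tensor name matches): `Session.run` on an `Operation` fetch returns None, while running the op's output tensor (the `:0` name) returns the computed value.

```python
import tensorflow.compat.v1 as tf  # assumes TF 2.x with the v1 compat API
tf.disable_eager_execution()

tf.reset_default_graph()
x = tf.placeholder(tf.float32, name="x")
out = tf.multiply(x, 2.0, name="the_output_for_prediction")

graph = tf.get_default_graph()
with tf.Session() as sess:
    # fetching the Operation: runs it, but returns None
    op = graph.get_operation_by_name("the_output_for_prediction")
    result_op = sess.run(op, feed_dict={x: 3.0})
    # fetching the output Tensor (note the ":0"): returns the value
    tensor = graph.get_tensor_by_name("the_output_for_prediction:0")
    result_tensor = sess.run(tensor, feed_dict={x: 3.0})

print(result_op)      # None
print(result_tensor)  # 6.0
```

This is documented `Session.run` behavior: an `Operation` fetch contributes None to the result, which is exactly what the question observed.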