TensorFlow hangs when calling sess.run(summary_merge)

Asked: 2019-06-07 09:45:27

Tags: python ubuntu tensorflow hang kaggle

When I run tensorflow.Session.run(tf.summary.merge_all()), it hangs there.

I'm working on a Kaggle competition and want to use a pre-trained model as the backbone of my model. So I imported the pre-trained model's meta graph and added new op nodes on top of the loaded graph to fit the annotation shape of the target dataset. Everything went well until I tried to visualize the training process with TensorBoard: it gets stuck at sess.run(tf.summary.merge_all()).
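For reference, the backbone-reuse pattern I followed looks roughly like this (a minimal sketch; the tensor names 'input:0' and 'embeddings:0' are placeholders for illustration, not the real names in my model):

import tensorflow as tf

g1 = tf.Graph()
with g1.as_default():
    # Importing the meta graph recreates the pre-trained ops and returns
    # a Saver that can later restore their weights.
    saver = tf.train.import_meta_graph('./model/model.meta')
    # Fetch the backbone's endpoints by name (placeholder names here).
    backbone_in = g1.get_tensor_by_name('input:0')
    backbone_out = g1.get_tensor_by_name('embeddings:0')
    # New head added on top of the loaded graph to match my labels.
    logits = tf.layers.dense(backbone_out, 1, name='new_head')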

Although I haven't tried this solution yet, it doesn't seem to apply to my case.

In that case, how should I rewrite my code according to this solution? Please give me the complete code. Thanks!
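My guess is that the fix amounts to merging an explicit list of my own summaries instead of calling tf.summary.merge_all(), which sweeps up every summary op in the graph, including any that arrived with the imported meta graph. Is this what the rewrite should look like? An untested sketch:

# Merge only the summaries created for this model, so nothing from the
# imported graph (with its unfed placeholders) gets pulled in.
loss_summary = tf.summary.scalar("loss", loss)
merged_summary = tf.summary.merge([loss_summary])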

My graph definition part:

g1 = tf.Graph()
with g1.as_default():
    with tf.name_scope("global_step"):
        global_steps = tf.Variable(0, trainable=False)

    with tf.name_scope("x1"):
        x1 = tf.placeholder(dtype=tf.float32, shape=[None, 160, 160, 3], name='x1')

    with tf.name_scope("x2"):
        x2 = tf.placeholder(dtype=tf.float32, shape=[None, 160, 160, 3], name='x2')

    with tf.name_scope("y_label"):
        y = tf.placeholder(dtype=tf.float32, shape=[None, 1], name='y')

    with tf.variable_scope("model") as scope:
        with tf.name_scope("model_for_x1"):
            saver1 = tf.train.import_meta_graph('./model/model.meta')
        scope.reuse_variables()
        with tf.name_scope("model_for_x2"):
            saver2 = tf.train.import_meta_graph('./model/model.meta') 

    # extract the output from pre-trained model
    # ... (some code omitted) ...
    # final output
    with tf.name_scope("final_output"):
        final_output = tf.identity(FC2, 'final_output')

    # loss function
    with tf.name_scope("Loss"):
        loss_per = tf.nn.sigmoid_cross_entropy_with_logits(
            labels=y,
            logits=final_output,
            name='loss_per')
        loss = tf.reduce_mean(loss_per, name='loss_average')

    # Optimizer
    with tf.name_scope("Optimizer"):
        optimizer = tf.train.AdamOptimizer(0.00001, name='Adam2')  # the imported graph already has an Adam optimizer, so give this one a different name
        train_step = optimizer.minimize(loss, global_step=global_steps)

    # visualization
    with tf.name_scope("Summary"):
        loss_summary = tf.summary.scalar("loss", loss)
        merged_summary = tf.summary.merge_all()

My training part:

with tf.Session(graph=g1) as sess:
    train_writer = tf.summary.FileWriter("./logs" + "/train", sess.graph)
    val_writer = tf.summary.FileWriter("./logs" + "/val")
    tf.global_variables_initializer().run()  # initialize the weights
    saver1.restore(sess, './model/model')  # overwrite the initialized weights with the pre-trained ones
    saver2.restore(sess, './model/model')  # overwrite the initialized weights with the pre-trained ones
    print("*****************Start Training!!!******************")
    for epochNum in range(epochs):
        valid_los, summary = sess.run([loss, loss_summary],  # <<------ hangs here; if I drop loss_summary, it runs fine
                                      feed_dict={
                                        x1: valid_batch_data[0],
                                        x2: valid_batch_data[1],
                                        y: valid_annotation})
        val_writer.add_summary(summary, sess.run(global_steps))  # add_summary expects an int step, not the Variable itself
        for iterNum in range(len(train)//batch_size):
            test_batch_data, test_annotation = next(gen(train, train_person_to_images_map, batch_size, (160, 160)))
            print('**********************************')
            train_los, _, summary = sess.run([loss, train_step, merged_summary],
                                             feed_dict={
                                                x1: test_batch_data[0],
                                                x2: test_batch_data[1],
                                                y: np.reshape(test_annotation, (batch_size, 1))})
            train_writer.add_summary(summary, sess.run(global_steps))  # add_summary expects an int step, not the Variable itself
            print('epoch: %d, iteration: %d, train_loss_per_iter: %f, valid_loss_per_epoch: %f' % (epochNum + 1, iterNum + 1, train_los, valid_los))
    train_writer.close()
    val_writer.close()
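To check what merge_all() actually collected, the SUMMARIES collection can be printed (a quick diagnostic sketch):

with g1.as_default():
    # Every op listed here gets folded into merge_all(); any entry that
    # originates from the imported meta graph is a suspect.
    for s in tf.get_collection(tf.GraphKeys.SUMMARIES):
        print(s.name)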

There is no error message at all; it just hangs there!
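One way to turn the silent hang into an explicit error is to give the run a deadline (sketch; the 60-second timeout is an arbitrary choice):

# Raises DeadlineExceededError instead of blocking forever, which at
# least proves the run is genuinely stuck rather than just slow.
run_options = tf.RunOptions(timeout_in_ms=60000)
valid_los, summary = sess.run([loss, loss_summary],
                              feed_dict={x1: valid_batch_data[0],
                                         x2: valid_batch_data[1],
                                         y: valid_annotation},
                              options=run_options)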

0 Answers:
