TensorFlow: simple 3D ConvNet fails to learn

Date: 2018-07-11 16:51:48

Tags: python tensorflow image-processing deep-learning

I'm trying to create a simple 3D U-Net for image segmentation, just to learn how to use the layers. So I do a 3D convolution with stride 2, followed by a transposed deconvolution to recover the original image size. I'm also overfitting a small set (the test set) to check whether my network is learning at all.

I created the same network in Keras and it works fine. Now I want to build it in TensorFlow, but I'm running into trouble.

The cost changes slightly, but no matter what I do (lower the learning rate, add more epochs, add more layers, change the batch size...), the output is always the same. I believe the net is not updating its weights. I'm sure I'm doing something wrong, but I can't find what it is. Any help would be appreciated.

Here is my code:

from tensorflow.python.framework import ops
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

def forward_propagation(X):

    if ( mode == 'train'): print(" --------- Net --------- ")

    # Convolutional Layer 1
    with tf.variable_scope('CONV1'):
        Z1 = tf.layers.conv3d(X, filters = 16, kernel_size = [3,3,3], strides = [ 2, 2, 2], padding='SAME', name = 'S2/conv3d')
        A1 = tf.nn.relu(Z1, name = 'S2/ReLU')
        if ( mode == 'train'): print("Convolutional Layer 1 S2 " + str(A1.get_shape()))

    # DEConvolutional Layer 1
    with tf.variable_scope('DeCONV1'):
        output_deconv1 = tf.stack([X.get_shape()[0] , X.get_shape()[1], X.get_shape()[2], X.get_shape()[3], 1])  # target output shape (currently unused)
        dZ1 = tf.layers.conv3d_transpose(A1,  filters = 1, kernel_size = [3,3,3], strides = [2, 2, 2], padding='SAME', name = 'S2/conv3d_transpose')
        dA1 = tf.nn.relu(dZ1, name = 'S2/ReLU')

        if ( mode == 'train'): print("Deconvolutional Layer 1 S1 " + str(dA1.get_shape()))

    return dA1


def compute_cost(output, target, method = 'dice_hard_coe'):

    with tf.variable_scope('COST'):       

        if (method == 'sigmoid_cross_entropy') :
            # Make them vectors
            output = tf.reshape( output, [-1, output.get_shape().as_list()[0]] )
            target = tf.reshape( target, [-1, target.get_shape().as_list()[0]] )
            loss = tf.nn.sigmoid_cross_entropy_with_logits(logits = output, labels = target)
            cost = tf.reduce_mean(loss)

    return cost

The main model function:

def model(X_h5, Y_h5, learning_rate = 0.009,
          num_epochs = 100, minibatch_size = 64, print_cost = True):


    ops.reset_default_graph()                         # to be able to rerun the model without overwriting tf variables
    #tf.set_random_seed(1)                             # to keep results consistent (tensorflow seed)
    #seed = 3                                          # to keep results consistent (numpy seed)
    (m, n_D, n_H, n_W, num_channels) = X_h5["test_data"].shape   #TTT          
    num_labels = Y_h5["test_mask"].shape[4] #TTT
    img_size = Y_h5["test_mask"].shape[1]  #TTT
    costs = []                                        # To keep track of the cost
    accuracies = []                                   # To keep track of the accuracy



    # Create Placeholders of the correct shape
    X, Y = create_placeholders(n_H, n_W, n_D, minibatch_size)

    # Forward propagation: Build the forward propagation in the tensorflow graph
    nn_output = forward_propagation(X)
    prediction = tf.nn.sigmoid(nn_output)

    # Cost function: Add cost function to tensorflow graph
    cost_method = 'sigmoid_cross_entropy' 
    cost = compute_cost(nn_output, Y, cost_method)

    # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer that minimizes the cost.
    optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)

    # Initialize all the variables globally
    init = tf.global_variables_initializer()


    # Start the session to compute the tensorflow graph
    with tf.Session() as sess:

        print('------ Training ------')

        # Run the initialization
        tf.local_variables_initializer().run(session=sess)
        sess.run(init)

        # Do the training loop
        for i in range(num_epochs*m):
            # ----- TRAIN -------
            current_epoch = i//m            

            patient_start = i-(current_epoch * m)
            patient_end = patient_start + minibatch_size

            current_X_train = np.zeros((minibatch_size, n_D,  n_H, n_W,num_channels))
            current_X_train[:,:,:,:,:] = np.array(X_h5["test_data"][patient_start:patient_end,:,:,:,:]) #TTT
            current_X_train = np.nan_to_num(current_X_train) # make nan zero

            current_Y_train = np.zeros((minibatch_size, n_D, n_H, n_W, num_labels))
            current_Y_train[:,:,:,:,:] = np.array(Y_h5["test_mask"][patient_start:patient_end,:,:,:,:]) #TTT
            current_Y_train = np.nan_to_num(current_Y_train) # make nan zero

            feed_dict = {X: current_X_train, Y: current_Y_train}
            _ , temp_cost = sess.run([optimizer, cost], feed_dict=feed_dict)

            # ----- TEST -------
            # Print the cost every 1/5 epoch
            if ((i % (num_epochs*m/5) )== 0):              

                # Calculate the predictions
                test_predictions = np.zeros(Y_h5["test_mask"].shape)

                for j in range(0, X_h5["test_data"].shape[0], minibatch_size):

                    patient_start = j
                    patient_end = patient_start + minibatch_size

                    current_X_test = np.zeros((minibatch_size, n_D,  n_H, n_W, num_channels))
                    current_X_test[:,:,:,:,:] = np.array(X_h5["test_data"][patient_start:patient_end,:,:,:,:])
                    current_X_test = np.nan_to_num(current_X_test) # make nan zero

                    current_Y_test = np.zeros((minibatch_size, n_D, n_H, n_W, num_labels))
                    current_Y_test[:,:,:,:,:] = np.array(Y_h5["test_mask"][patient_start:patient_end,:,:,:,:]) 
                    current_Y_test = np.nan_to_num(current_Y_test) # make nan zero

                    feed_dict = {X: current_X_test, Y: current_Y_test}
                    _, current_prediction = sess.run([cost, prediction], feed_dict=feed_dict)
                    test_predictions[j:j + minibatch_size,:,:,:,:] = current_prediction

                costs.append(temp_cost)
                print ("[" + str(current_epoch) + "|" + str(num_epochs) + "] " + "Cost : " + str(costs[-1]))
                display_progress(X_h5["test_data"], Y_h5["test_mask"], test_predictions, 5, n_H, n_W)

        # plot the cost
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('epochs')
        plt.show()

        return  

I call the model with:

model(hdf5_data_file, hdf5_mask_file, num_epochs = 500, minibatch_size = 1, learning_rate = 1e-3)

These are the results I'm currently getting (result screenshots omitted).

Edit: I have tried lowering the learning rate, with no luck. I also tried debugging with TensorBoard, and the weights are not updating.

I'm not sure why this is happening. I created the same simple model in Keras and it works fine. I'm not sure what I'm doing wrong in TensorFlow.
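For reference, logging weight histograms to TensorBoard can be done along these lines (a minimal sketch; the summary names and log directory are placeholders, not the exact code used here):

# Before training: add a histogram summary for every trainable variable,
# so TensorBoard shows whether the weights change between steps.
for var in tf.trainable_variables():
    tf.summary.histogram(var.name.replace(':', '_'), var)
merged_summaries = tf.summary.merge_all()
summary_writer = tf.summary.FileWriter('./logs', sess.graph)

# Inside the training loop, fetch the summaries together with the cost:
_, temp_cost, summary = sess.run([optimizer, cost, merged_summaries], feed_dict=feed_dict)
summary_writer.add_summary(summary, i)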

1 Answer:

Answer 0 (score: 0):

Not sure whether you are still looking for help, since I'm answering this half a year after your posting date. :) I've listed my observations below, along with some suggestions for you to try. If my main observation is right... then you probably just need a coffee break / a good night's sleep.

Main observation:

  • tf.reshape( output, [-1, output.get_shape().as_list()[0]] ) looks wrong. If you want to flatten each example into a vector, it should be something like tf.reshape(output, [-1, np.prod(image_shape_list)]) (see the sketch below).
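A minimal sketch of what I mean, assuming numpy is imported as np and the mask has a single channel (the function and variable names here are only for illustration):

def compute_cost_flat(output, target):
    # Flatten each example to one long vector: [batch, D*H*W*channels]
    image_shape_list = output.get_shape().as_list()[1:]   # e.g. [D, H, W, 1]
    flat_len = int(np.prod(image_shape_list))
    output_flat = tf.reshape(output, [-1, flat_len])
    target_flat = tf.reshape(target, [-1, flat_len])
    loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=output_flat, labels=target_flat)
    return tf.reduce_mean(loss)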

Other observations:

  • For such a shallow network, I doubt it has enough spatial resolution to differentiate tumor voxels from non-tumor voxels. Can you show the Keras implementation and its performance compared to the pure tf implementation? I would probably use 2+ layers; say, with 3 layers, each with a stride of 2, and an input image width of 256, the width at the deepest encoder layer would be 32 (a rough sketch of such an encoder/decoder follows this list). (If you are limited by GPU memory, downsample the input image.)
  • If changing the loss computation does not help, then, as @bremen_matt mentioned, reduce the LR to something like 1e-5.
  • After the basic architecture tweaks, once you "feel" that the network is sort of learning and not stuck, try augmenting the training data, adding dropout and batch norm during training, and then maybe fancy up your loss by adding a discriminator.
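To make the "more layers" point concrete, here is a rough sketch of a 3-level encoder/decoder built from stride-2 convolutions (the filter counts and layer names are placeholders, not tuned values):

def forward_propagation_deeper(X):
    # Encoder: three stride-2 blocks, e.g. width 256 -> 128 -> 64 -> 32
    e1 = tf.nn.relu(tf.layers.conv3d(X,  16, [3, 3, 3], strides=[2, 2, 2], padding='SAME', name='enc1'))
    e2 = tf.nn.relu(tf.layers.conv3d(e1, 32, [3, 3, 3], strides=[2, 2, 2], padding='SAME', name='enc2'))
    e3 = tf.nn.relu(tf.layers.conv3d(e2, 64, [3, 3, 3], strides=[2, 2, 2], padding='SAME', name='enc3'))

    # Decoder: mirror the encoder with transposed convolutions back to the input size
    d3 = tf.nn.relu(tf.layers.conv3d_transpose(e3, 32, [3, 3, 3], strides=[2, 2, 2], padding='SAME', name='dec3'))
    d2 = tf.nn.relu(tf.layers.conv3d_transpose(d3, 16, [3, 3, 3], strides=[2, 2, 2], padding='SAME', name='dec2'))
    logits = tf.layers.conv3d_transpose(d2, 1, [3, 3, 3], strides=[2, 2, 2], padding='SAME', name='dec1')
    return logits  # raw logits: apply the sigmoid outside, as in the original code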