TensorFlow model's cost is always 0

Time: 2019-11-14 18:01:15

Tags: python tensorflow deep-learning computer-vision artificial-intelligence

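I have been learning TensorFlow through this computer vision project, but I am not sure I understand it well enough. I think the session part is correct, and the graph seems to be where the problem is. Here is my code: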

def model_train(placeholder_dimensions, filter_dimensions, strides, learning_rate, num_epochs, minibatch_size, print_cost = True):

    # for training purposes
    tf.reset_default_graph()

    # create datasets
    train_set, test_set = load_dataset()  # custom function and custom-made dataset
    X_train = np.array([ex[0] for ex in train_set])
    Y_train = np.array([ex[1] for ex in train_set])
    X_test = np.array([ex[0] for ex in test_set])
    Y_test = np.array([ex[1] for ex in test_set])

    #convert to one-hot encodings
    Y_train = tf.one_hot(Y_train, depth = 10)
    Y_test = tf.one_hot(Y_test, depth = 10)

    m = len(train_set)

    costs = []

    tf.reset_default_graph()
    graph = tf.get_default_graph()    
    with graph.as_default():    

        # create placeholders
        X, Y = create_placeholders(*placeholder_dimensions)

        # initialize parameters
        parameters = initialize_parameters(filter_dimensions)

        # forward propagate
        Z4 = forward_propagation(X, parameters, strides)

        # compute cost
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = Z4, labels = Y))

        # define optimizer for backpropagation that minimizes the cost function 
        optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

    # initialize variables    
    init = tf.global_variables_initializer()

    # start session
    with tf.Session() as sess:

        sess.run(init)

        for epoch in range(num_epochs):
            minibatch_cost = 0.
            num_minibatches = int(m / minibatch_size)

            # get random minibatch
            minibatches = random_minibatches(np.array([X_train, Y_train]), minibatch_size)

            for minibatch in minibatches:
                minibatch_X, minibatch_Y = minibatch
                _ , temp_cost = sess.run([optimizer, cost], {X: minibatch_X, Y: minibatch_Y})
                minibatch_cost += temp_cost / num_minibatches

            if print_cost == True and epoch % 5 == 0:
                print('Cost after epoch %i: %f' %(epoch, minibatch_cost))
            if print_cost == True:
                costs.append(minibatch_cost)

    # plot the costs
    plot_cost(costs, learning_rate)

    # calculate correct predictions 
    prediction = tf.argmax(Z4, 1)
    correct_prediction = tf.equal(prediction, tf.argmax(Y, 1))

    # calculate accuracy on test set
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float'))
    train_accuracy = accuracy.eval({X: X_train, Y: Y_train})
    test_accuracy = accuracy.eval({X: X_test, Y: Y_test})
    print('Training set accuracy:', train_accuracy)
    print('Test set accuracy:', test_accuracy)

    return parameters

where the create_placeholder and initialize_parameters functions are as follows:

def initialize_parameters(filter_dimensions):

    # initialize weight parameters for convolution layers
    W1 = tf.get_variable('W1', shape = filter_dimensions['W1'])
    W2 = tf.get_variable('W2', shape = filter_dimensions['W2'])

    parameters = {'W1': W1, 'W2': W2} 

    return parameters


def forward_propagation(X, parameters, strides):

    with tf.variable_scope('model1'):

        # first block
        Z1 = tf.nn.conv2d(X, parameters['W1'], strides['conv1'], padding = 'VALID')
        A1 = tf.nn.relu(Z1)
        P1 = tf.nn.max_pool(A1, ksize = strides['pool1'], strides = strides['pool1'], padding = 'VALID')

        # second block
        Z2 = tf.nn.conv2d(P1, parameters['W2'], strides['conv2'], padding = 'VALID')
        A2 = tf.nn.relu(Z2)
        P2 = tf.nn.max_pool(A2, ksize = strides['pool2'], strides = strides['pool2'], padding = 'VALID')

        # flatten
        F = tf.contrib.layers.flatten(P2)

        # dense block
        Z3 = tf.contrib.layers.fully_connected(F, 50)
        A3 = tf.nn.relu(Z3)

        # output
        Z4 = tf.contrib.layers.fully_connected(A3, 10, activation_fn = None)

    return Z4

I have prior experience with Keras, but I can't find the problem here.

1 answer:

Answer 0 (score: 0)

I would check two things first:

#convert to one-hot encodings
Y_train = tf.one_hot(Y_train, depth = 10)
Y_test = tf.one_hot(Y_test, depth = 10)

First: check whether this code actually outputs what you expect.
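One way to run that check, sketched under assumptions: in TensorFlow 1.x graph mode, tf.one_hot returns a symbolic Tensor rather than a NumPy array, so printing it shows only the tensor description until it is evaluated in a session. In the question the tensors are also created before the second tf.reset_default_graph() call, so they belong to a graph that is then discarded. Encoding the labels in NumPy makes them easy to inspect and to feed to the Y placeholder. The one_hot helper and sample_labels below are hypothetical stand-ins, not part of the original code.

```python
import numpy as np

def one_hot(labels, depth):
    """Return a (len(labels), depth) float32 array.

    Labels outside [0, depth) produce all-zero rows, which makes them easy to spot.
    """
    labels = np.asarray(labels, dtype=np.int64).reshape(-1)
    encoded = np.zeros((labels.size, depth), dtype=np.float32)
    valid = (labels >= 0) & (labels < depth)
    encoded[np.nonzero(valid)[0], labels[valid]] = 1.0
    return encoded

# Hypothetical labels standing in for Y_train; 10 is deliberately out of range.
sample_labels = [3, 0, 9, 10]
encoded = one_hot(sample_labels, depth=10)

print(encoded.shape)        # (4, 10)
print(encoded.sum(axis=1))  # [1. 1. 1. 0.]
```

Every row should sum to 1. A row that sums to 0 means the label fell outside the expected class range, and a shape other than (number of examples, 10) suggests the raw labels were not a flat list of integers.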

Second: if that looks the way you expect, double-check the model initialization.
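A rough sanity check for the initial cost, as a NumPy sketch rather than the original TensorFlow code: softmax cross-entropy is -sum(labels * log_softmax(logits)) per example, so a freshly initialized 10-class model with near-uniform outputs should start at a cost of about ln(10), roughly 2.3. A cost of exactly 0 from the first epoch onward may instead point at label rows that are all zeros. The function name and the sample arrays below are illustrative assumptions.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean over the batch of -sum(labels * log_softmax(logits))."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(np.mean(-np.sum(labels * log_probs, axis=1)))

num_classes = 10
# Stand-in for the near-uniform logits of an untrained model.
logits = np.zeros((4, num_classes))
# Valid one-hot labels for classes 3, 0, 9 and 1.
good_labels = np.eye(num_classes)[[3, 0, 9, 1]]
# What broken or out-of-range labels can look like after encoding.
empty_labels = np.zeros((4, num_classes))

expected_initial = softmax_cross_entropy(logits, good_labels)
degenerate = softmax_cross_entropy(logits, empty_labels)

print("loss with valid one-hot labels:", round(expected_initial, 4))  # 2.3026
print("loss with all-zero labels:", abs(degenerate))                  # 0.0
```

If the first printed cost during training is far from 2.3, comparing the labels actually fed to the placeholder against these two cases is a cheap way to narrow the problem down before touching the weights.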

Just my 2 cents.