TensorFlow: loss value is inconsistent with accuracy

Asked: 2018-03-07 06:30:02

Tags: python tensorflow machine-learning neural-network

I am building a simple neural network with a single hidden layer using TensorFlow.

For the input, each row of data corresponds to 10 answers. The first 2 elements of each row are always correct, i.e., the same as the ground-truth label. In contrast, the last 8 elements are the opposite of the ground-truth label.

For example,

[1, 1, 0, 0, 0, 0, 0, 0, 0, 0], correct is 1
[0, 0, 1, 1, 1, 1, 1, 1, 1, 1], correct is 0
[0, 0, 1, 1, 1, 1, 1, 1, 1, 1], correct is 0
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0], correct is 1

I expect my neural network to learn that the first two elements/features always give the correct result, and therefore to assign larger weights to those two features. However, the network always gets stuck at some loss value.

More interestingly, accuracy is measured as the fraction of correctly predicted labels out of the total number of labels. The loss is the sigmoid (binary) cross-entropy, i.e. $-\big(y \log(\hat{y}) + (1 - y) \log(1 - \hat{y})\big)$, where $\hat{y}$ is the sigmoid output. Strangely, the accuracy sometimes drops even as the loss decreases. For example,

epoch is:  0 loss is:  7.661093  accuracy value is:  1.0 
epoch is:  100 loss is:  7.579134  accuracy value is:  0.54545456 
epoch is:  200 loss is:  7.5791006  accuracy value is:  0.54545456 
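
For reference, here is a minimal NumPy sketch of that loss; the arrays y and p are hypothetical stand-ins for the labels and the sigmoid outputs, and the clipping mirrors the tf.clip_by_value calls in the code below:

import numpy as np

# Mean binary cross-entropy over a 1-D label array y and sigmoid outputs p.
def binary_cross_entropy(y, p, eps=1e-10):
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0) on either term
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

print(binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.2])))  # ~0.164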

I thought the network could keep increasing the weights of the first two elements until it predicts the correct labels perfectly.

Can anyone tell me what I should do to make the network predict the labels correctly instead of getting stuck?

My code is here:

import tensorflow as tf
import numpy as np


class SigmoidNeuralNetwork():
    def __init__(self, learning_rate, training_data, correct_labels, epoch_number):
        self.learning_rate = learning_rate
        self.training_data = training_data
        self.correct_labels = correct_labels

        self.X = tf.placeholder(tf.float32)
        self.y = tf.placeholder(tf.float32)

        self.feature_num = len(self.training_data[0])
        self.sample_num = len(self.training_data)

        self.W = tf.Variable(tf.random_uniform([self.feature_num, 1], -1.0, 1.0), dtype=tf.float32)
        self.b = tf.Variable([0.0])

        self.epoch_number = epoch_number

    def launch_network(self):
        db = tf.matmul(self.X, tf.reshape(self.W, [-1, 1])) + self.b
        hyp = tf.sigmoid(db)

        cost0 = self.y * tf.log(tf.clip_by_value(hyp, 1e-10, 1.0))
        cost1 = (1 - self.y) * tf.log(tf.clip_by_value((1 - hyp), 1e-10, 1.0))
        cost = (cost0 + cost1) / float(self.sample_num)
        loss = -tf.reduce_sum(cost)

        optimizer = tf.train.GradientDescentOptimizer(learning_rate=self.learning_rate)
        train = optimizer.minimize(loss)

        # Evaluate accuracy on the full training set with the current weights.
        new_train_X = self.training_data.astype(np.float32)

        output = tf.add(tf.matmul(new_train_X, self.W), self.b)
        prediction = tf.sigmoid(output)

        predicted_class = tf.greater(prediction, 0.5)
        ground_labels = tf.reshape(tf.equal(self.y, 1.0), predicted_class.shape)
        correct = tf.equal(predicted_class, ground_labels)
        accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
        #

        init = tf.global_variables_initializer()
        sess = tf.Session()
        sess.run(init)

        for epoch in range(self.epoch_number):
            _, loss_val, accuracy_val = sess.run([train, loss, accuracy], {self.X: self.training_data, self.y: self.correct_labels})

            if epoch % 100 == 0:
                print "epoch is: ", epoch, "loss is: ", loss_val, " accuracy value is: ", accuracy_val
                # print "weight is: ", sess.run(self.W).flatten()


train_data = np.array([
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
])

correct_answers = np.array([1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1])

sigmoid_network = SigmoidNeuralNetwork(learning_rate=0.01, training_data=train_data, correct_labels=correct_answers,
                                       epoch_number=10000)

sigmoid_network.launch_network()

1 Answer:

Answer 0 (score: 1)

What is the problem?

The OP wrote:

I thought the network could keep increasing the weights of the first two elements until it predicts the correct labels perfectly.

You are exactly right.

Can anyone tell me what I should do to make the network predict the labels correctly instead of getting stuck?

The problem lies in the function launch_network():

def launch_network(self):
    db = tf.matmul(self.X, tf.reshape(self.W, [-1, 1])) + self.b
    hyp = tf.sigmoid(db)

    cost0 = self.y * tf.log(tf.clip_by_value(hyp, 1e-10, 1.0))
    ... (skip) ...

Note that db and hyp have the same shape (self.sample_num, 1) (2-dim), but self.y (i.e., correct_answers) has shape (self.sample_num,) (1-dim).

On the 5th line, where cost0 is computed, you multiply self.y * tf.log(...hyp...). Because of broadcasting, the result's shape becomes (self.sample_num, self.sample_num) instead of (self.sample_num, 1).
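
A short NumPy sketch (NumPy follows the same broadcasting rules as TensorFlow) reproduces the shape blow-up; sample_num = 11 matches the question's training data:

import numpy as np

sample_num = 11
log_hyp = np.zeros((sample_num, 1))  # stands in for tf.log(...hyp...), shape (11, 1)
y = np.zeros(sample_num)             # stands in for self.y, shape (11,)

cost0 = y * log_hyp                  # (11,) broadcasts against (11, 1)
print(cost0.shape)                   # (11, 11), not (11, 1)

Summing that (11, 11) matrix pairs every label with every prediction, so gradient descent minimizes the wrong objective; that is why the loss plateaus while the accuracy behaves erratically.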

Suggested solution

The simplest solution is to change the shape of correct_answers to (self.sample_num, 1) (2-dim) instead of (self.sample_num,) (1-dim), like this:

correct_answers = np.array([1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1])[:,np.newaxis]
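
Alternatively (a sketch, not part of the original answer): reshape the labels inside the graph and let TensorFlow's built-in, numerically stable sigmoid cross-entropy work on the raw logits (db in the question) instead of clipping by hand. The placeholder shapes below are assumptions matching the question's data:

import tensorflow as tf

y = tf.placeholder(tf.float32, shape=[None])          # 1-D labels, as in the question
logits = tf.placeholder(tf.float32, shape=[None, 1])  # db, i.e. the pre-sigmoid output

y2d = tf.reshape(y, [-1, 1])  # now (N, 1), matching the logits: no bad broadcast
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y2d, logits=logits))

Either way, the key point is the same: make the label tensor and the prediction tensor have identical shapes before multiplying or subtracting them.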