Optimizer does not consistently improve training accuracy

Date: 2018-05-28 02:06:25

Tags: python, tensorflow

I am trying to train a logistic regression model, but training accuracy does not increase consistently no matter how small I make the training set. I cut the training set down to 3 examples: the model sometimes starts at 66.66% training accuracy and ends at 33.33%; other times it starts at 0% and ends at 66.66%. It never reaches 100% accuracy. The behavior is the same with training sets of size 32, 200, and 400, with starting accuracy around 50% and final accuracy between 40% and 60%.

The model code is below:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

def get_batch(index, tensors, batch_size, nItems):
    xs, ys = tensors
    begin = index * batch_size
    end = min((index+1)*batch_size, nItems)
    y_b = ys[begin:end]

    (inds, vals, dsize) = xs
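    # keep only the nonzero entries whose row index falls in [begin, end), re-based so rows start at 0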
    nInds = inds[(begin <= inds[:,0]) & (inds[:,0] < end)] - np.array([begin, 0])
    nVals = vals[:nInds.shape[0]]
    nDsize = (end - begin, dsize[1])
    x_b = tf.SparseTensorValue(nInds, nVals, nDsize)
    return (x_b, y_b)

class OneLayerNet(object):
    def __init__(self, num_feats, num_outputs):
        self.batch_size = 3
        self.epochs = 100
        self.eta = 0.01
        self.reg_const = 0

        self.x = tf.sparse_placeholder(tf.float64, name="placeholderx") # num_sents x num_feats
        self.y = tf.placeholder(tf.float64, name="placeholdery") # 1 x num_sents
        self.w = tf.Variable(tf.random_normal([num_feats, num_outputs], stddev=0.01, dtype=tf.float64)) # num_feats x 1
        self.b = tf.Variable(tf.zeros([num_outputs], dtype=tf.float64))

        self.wx = tf.sparse_tensor_dense_matmul(self.x, self.w)
        self.scores = tf.add(self.wx, self.b)
        self.probs = 1 / (1 + tf.exp(-self.scores))
        self.probs = tf.clip_by_value(self.probs, 0.001, .999)
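        # per-example log-likelihood: y*log(p) + (1-y)*log(1-p)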
        self.loss_vect = self.y*tf.log(self.probs) + (1-self.y)*tf.log(1-self.probs)
        self.loss = -tf.reduce_mean(self.loss_vect) # + self.reg_const/2 * tf.square(tf.norm(self.w))
        self.optimizer = tf.train.AdamOptimizer(learning_rate=self.eta).minimize(self.loss)
        self.session = tf.Session()
        self.session.run(tf.global_variables_initializer())

    def train(self, x, y, loss_graph_file):
        session = self.session
        num_batches = y.shape[0] // self.batch_size
        loss_vect = []

        for epoch in range(self.epochs):
            avg_loss = 0
            for i in range(num_batches):
                batch_x, batch_y = get_batch(i, [x, y], self.batch_size, y.shape[0])
                _, loss, w = session.run([self.optimizer, self.loss, self.w], {self.x: batch_x, self.y: batch_y})
                avg_loss += loss/num_batches

            loss_vect.append(avg_loss)
            if epoch % 10 == 0 or epoch == self.epochs-1:
                print("Epoch {}: loss = {}".format(epoch, avg_loss))
                print("Weights: {}".format(w))

        plt.plot(loss_vect)
        plt.ylabel('Loss')
        plt.xlabel('Epoch')
        plt.savefig(loss_graph_file)

    def eval(self, x, y, predictions_file):
        session = self.session
        num_batches = y.shape[0] // self.batch_size
        num_correct = 0

        with open(predictions_file, 'w') as f:
            for i in range(num_batches + 1):
                batch_x, batch_y = get_batch(i, [x, y], self.batch_size, y.shape[0])
                probs = session.run(self.probs, {self.x: batch_x})
                predictions = np.transpose(probs >= 0.5)[0]
                num_correct += np.sum(np.equal(predictions, batch_y))
                for j in range(batch_y.shape[0]):
                    f.write('{}\t{}\t{}\n'.format(probs[j], int(predictions[j]), batch_y[j]))

        accuracy = num_correct/len(y)
        return accuracy

I have tried the suggestions from this answer, but the behavior stays the same. I am using TensorFlow 1.5.0.

UPDATE: I printed the sigmoid output for each sentence, and every one gets closer and closer to 50%. I also tried using my setup to learn the AND function (a minimal sketch of that experiment follows); during training the weights get closer and closer to 0.
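
For reproducibility, here is a minimal sketch of the AND experiment, showing how the data can be packed into the (indices, values, dense_shape) triple that get_batch expects. The encoding and driver code are illustrative assumptions, not my verbatim script:

import numpy as np

# Hypothetical encoding of the AND truth table as a sparse 4x2 matrix.
# Rows are the inputs (0,0), (0,1), (1,0), (1,1); only nonzero entries are stored.
inds = np.array([[1, 1], [2, 0], [3, 0], [3, 1]])  # (row, col) of each nonzero entry
vals = np.array([1.0, 1.0, 1.0, 1.0])
dsize = (4, 2)
ys = np.array([0.0, 0.0, 0.0, 1.0])  # label = AND of the two input bits

net = OneLayerNet(num_feats=2, num_outputs=1)
net.batch_size = 4   # override the hard-coded 3 so one batch covers the full truth table
net.epochs = 2000    # match the logged run below
net.train((inds, vals, dsize), ys, "and_loss.png")
print(net.eval((inds, vals, dsize), ys, "and_predictions.txt"))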

Epoch 0: loss = 4.133313990920284
Weights: [[-0.59451162]
 [ 0.55122256]]
Bias: [-0.01]
Epoch 100: loss = 3.0849339200727615
Weights: [[-0.70471682]
 [-0.04904535]]
Bias: [-0.63568272]
Epoch 200: loss = 3.0166726382814177
Weights: [[-0.2748711 ]
 [-0.13774631]]
Bias: [-0.834027]
Epoch 300: loss = 3.004324396806258
Weights: [[-0.108655 ]
 [-0.1161173]]
Bias: [-0.95526422]
Epoch 400: loss = 3.0011826475632546
Weights: [[-0.04740128]
 [-0.06981994]]
Bias: [-1.02420669]
Epoch 500: loss = 3.0002812775795973
Weights: [[-0.02161358]
 [-0.03521941]]
Bias: [-1.06242562]
Epoch 600: loss = 3.0000558857071757
Weights: [[-0.0094973 ]
 [-0.01578322]]
Bias: [-1.08245493]
Epoch 700: loss = 3.00000916752074
Weights: [[-0.00384123]
 [-0.00638793]]
Bias: [-1.09205959]
Epoch 800: loss = 3.0000012291196088
Weights: [[-0.00140626]
 [-0.00233578]]
Bias: [-1.09621262]
Epoch 900: loss = 3.000000133321497
Weights: [[-0.00046284]
 [-0.00076831]]
Bias: [-1.09782245]
Epoch 1000: loss = 3.0000000115763847
Weights: [[-0.00013625]
 [-0.00022613]]
Bias: [-1.09837977]
Epoch 1100: loss = 3.0000000007953758
Weights: [[-3.56729609e-05]
 [-5.91996755e-05]]
Bias: [-1.09855141]
Epoch 1200: loss = 3.0000000000426725
Weights: [[-8.25235844e-06]
 [-1.36946603e-05]]
Bias: [-1.09859821]
Epoch 1300: loss = 3.0000000000017604
Weights: [[-1.67385710e-06]
 [-2.77772968e-06]]
Bias: [-1.09860943]
Epoch 1400: loss = 3.000000000000055
Weights: [[-2.95008595e-07]
 [-4.89560038e-07]]
Bias: [-1.09861179]
Epoch 1500: loss = 3.000000000000001
Weights: [[-4.46992207e-08]
 [-7.41773288e-08]]
Bias: [-1.09861221]
Epoch 1600: loss = 3.0
Weights: [[-5.74942725e-09]
 [-9.54104229e-09]]
Bias: [-1.09861228]
Epoch 1700: loss = 3.0
Weights: [[-6.18335872e-10]
 [-1.02611408e-09]]
Bias: [-1.09861229]
Epoch 1800: loss = 3.0
Weights: [[-5.45849278e-11]
 [-9.05823934e-11]]
Bias: [-1.09861229]
Epoch 1900: loss = 3.0
Weights: [[-3.86521516e-12]
 [-6.41401071e-12]]
Bias: [-1.09861229]
Epoch 1999: loss = 3.0
Weights: [[-2.19714497e-13]
 [-3.64640086e-13]]
Bias: [-1.09861229]
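
Note that the bias converges to -1.09861229 = ln(1/3), i.e. sigmoid(bias) = 1/(1 + 3) = 0.25, which is exactly the fraction of positive examples (1 of 4) in the AND truth table, while the weights vanish to 0.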

1 Answer:

Answer 0 (score: 0):

I suggest you try Xavier initialization for the weights. It looks like this:

W = tf.get_variable("W",
                    shape=[x, y],
                    initializer=tf.contrib.layers.xavier_initializer())

Here x and y are the dimensions of the layer.
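
Applied to the model in the question, that means replacing the tf.random_normal initializer for self.w. A minimal sketch, keeping the question's variable shape and float64 dtype (untested against the rest of the setup):

# In OneLayerNet.__init__, instead of self.w = tf.Variable(tf.random_normal(...)):
self.w = tf.get_variable("w",
                         shape=[num_feats, num_outputs],
                         dtype=tf.float64,
                         initializer=tf.contrib.layers.xavier_initializer())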