I'm trying to train a logistic regression model, but no matter how small I make the training set, the training accuracy never increases consistently. I shrank the training set to 3 examples; the model sometimes starts at 66.66% training accuracy and ends at 33.33%, other times it starts at 0% and ends at 66.66%. It never reaches 100% accuracy. It behaves the same way with training sets of size 32, 200, and 400: starting accuracy is around 50%, and final accuracy lands somewhere between 40% and 60%.
The model code is as follows:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

def get_batch(index, tensors, batch_size, nItems):
    xs, ys = tensors
    begin = index * batch_size
    end = min((index+1)*batch_size, nItems)
    y_b = ys[begin:end]
    # xs is a sparse matrix given as an (indices, values, dense_shape) triple
    (inds, vals, dsize) = xs
    # keep the rows of this batch and shift their row indices to start at 0
    nInds = inds[(begin <= inds[:,0]) & (inds[:,0] < end)] - np.array([begin, 0])
    nVals = vals[:nInds.shape[0]]
    nDsize = (end - begin, dsize[1])
    x_b = tf.SparseTensorValue(nInds, nVals, nDsize)
    return (x_b, y_b)
class OneLayerNet(object):
    def __init__(self, num_feats, num_outputs):
        self.batch_size = 3
        self.epochs = 100
        self.eta = 0.01
        self.reg_const = 0

        self.x = tf.sparse_placeholder(tf.float64, name="placeholderx")  # num_sents x num_feats
        self.y = tf.placeholder(tf.float64, name="placeholdery")  # 1 x num_sents
        self.w = tf.Variable(tf.random_normal([num_feats, num_outputs], stddev=0.01, dtype=tf.float64))  # num_feats x 1
        self.b = tf.Variable(tf.zeros([num_outputs], dtype=tf.float64))

        self.wx = tf.sparse_tensor_dense_matmul(self.x, self.w)
        self.scores = tf.add(self.wx, self.b)
        # sigmoid, clipped to keep the log() in the loss finite
        self.probs = 1 / (1 + tf.exp(-self.scores))
        self.probs = tf.clip_by_value(self.probs, 0.001, .999)

        # binary cross-entropy (L2 regularization currently disabled)
        self.loss_vect = self.y*tf.log(self.probs) + (1-self.y)*tf.log(1-self.probs)
        self.loss = -tf.reduce_mean(self.loss_vect)  # + self.reg_const/2 * tf.square(tf.norm(self.w))
        self.optimizer = tf.train.AdamOptimizer(learning_rate=self.eta).minimize(self.loss)

        self.session = tf.Session()
        self.session.run(tf.global_variables_initializer())
    def train(self, x, y, loss_graph_file):
        session = self.session
        num_batches = y.shape[0] // self.batch_size
        loss_vect = []
        for epoch in range(self.epochs):
            avg_loss = 0
            for i in range(num_batches):
                batch_x, batch_y = get_batch(i, [x, y], self.batch_size, y.shape[0])
                _, loss, w = session.run([self.optimizer, self.loss, self.w],
                                         {self.x: batch_x, self.y: batch_y})
                avg_loss += loss/num_batches
            loss_vect.append(avg_loss)
            if epoch % 10 == 0 or epoch == self.epochs-1:
                print("Epoch {}: loss = {}".format(epoch, avg_loss))
                print("Weights: {}".format(w))
        plt.plot(loss_vect)
        plt.ylabel('Loss')
        plt.xlabel('Epoch')
        plt.savefig(loss_graph_file)
    def eval(self, x, y, predictions_file):
        session = self.session
        num_batches = y.shape[0] // self.batch_size
        num_correct = 0
        with open(predictions_file, 'w') as f:
            for i in range(num_batches + 1):
                batch_x, batch_y = get_batch(i, [x, y], self.batch_size, y.shape[0])
                probs = session.run(self.probs, {self.x: batch_x})
                predictions = np.transpose(probs >= 0.5)[0]
                num_correct += np.sum(np.equal(predictions, batch_y))
                for j in range(batch_y.shape[0]):
                    f.write('{}\t{}\t{}\n'.format(probs[j], int(predictions[j]), batch_y[j]))
        accuracy = num_correct/len(y)
        return accuracy
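For reference, x here is a sparse matrix passed around as the (indices, values, dense_shape) triple that get_batch slices. A toy illustration of the format (the numbers are hypothetical, not my real data):

# Hypothetical 4 x 3 sparse matrix with indices sorted by row.
inds = np.array([[0, 1], [1, 0], [2, 2], [3, 1]])  # (nnz, 2) [row, col] pairs
vals = np.array([1.0, 2.0, 3.0, 4.0])              # matching non-zero values
xs = (inds, vals, (4, 3))                          # dense_shape = (num_sents, num_feats)
ys = np.array([0., 1., 0., 1.])

# First batch of size 2 -> rows 0 and 1 as a tf.SparseTensorValue
x_b, y_b = get_batch(0, [xs, ys], batch_size=2, nItems=4)
print(x_b.indices)  # [[0 1] [1 0]]
print(y_b)          # [0. 1.]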
I've tried the suggestions from this answer, but the behavior is still the same. I'm using TensorFlow 1.5.0.
Update: I printed the model's output probability (the sigmoid) for each sentence, and every one gets closer and closer to 50%. I also tried using my setup to learn the AND function; during training the weights get closer and closer to 0, as the log below shows.
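The AND setup is essentially the following (a reconstructed sketch; the exact arrays are an assumption, but they are built this way from the truth table, and the epoch count was raised for this run):

# Reconstructed sketch of the AND sanity check (arrays are an assumption).
dense_x = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # x1, x2
y = np.array([0., 0., 0., 1.])                                # x1 AND x2

inds = np.argwhere(dense_x != 0.)  # (nnz, 2) [row, col] pairs, row-major order
vals = dense_x[dense_x != 0.]      # matching non-zero values
xs = (inds, vals, dense_x.shape)

net = OneLayerNet(num_feats=2, num_outputs=1)
net.train(xs, y, 'and_loss.png')   # a run like this produces the trace below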
Epoch 0: loss = 4.133313990920284
Weights: [[-0.59451162]
[ 0.55122256]]
Bias: [-0.01]
Epoch 100: loss = 3.0849339200727615
Weights: [[-0.70471682]
[-0.04904535]]
Bias: [-0.63568272]
Epoch 200: loss = 3.0166726382814177
Weights: [[-0.2748711 ]
[-0.13774631]]
Bias: [-0.834027]
Epoch 300: loss = 3.004324396806258
Weights: [[-0.108655 ]
[-0.1161173]]
Bias: [-0.95526422]
Epoch 400: loss = 3.0011826475632546
Weights: [[-0.04740128]
[-0.06981994]]
Bias: [-1.02420669]
Epoch 500: loss = 3.0002812775795973
Weights: [[-0.02161358]
[-0.03521941]]
Bias: [-1.06242562]
Epoch 600: loss = 3.0000558857071757
Weights: [[-0.0094973 ]
[-0.01578322]]
Bias: [-1.08245493]
Epoch 700: loss = 3.00000916752074
Weights: [[-0.00384123]
[-0.00638793]]
Bias: [-1.09205959]
Epoch 800: loss = 3.0000012291196088
Weights: [[-0.00140626]
[-0.00233578]]
Bias: [-1.09621262]
Epoch 900: loss = 3.000000133321497
Weights: [[-0.00046284]
[-0.00076831]]
Bias: [-1.09782245]
Epoch 1000: loss = 3.0000000115763847
Weights: [[-0.00013625]
[-0.00022613]]
Bias: [-1.09837977]
Epoch 1100: loss = 3.0000000007953758
Weights: [[-3.56729609e-05]
[-5.91996755e-05]]
Bias: [-1.09855141]
Epoch 1200: loss = 3.0000000000426725
Weights: [[-8.25235844e-06]
[-1.36946603e-05]]
Bias: [-1.09859821]
Epoch 1300: loss = 3.0000000000017604
Weights: [[-1.67385710e-06]
[-2.77772968e-06]]
Bias: [-1.09860943]
Epoch 1400: loss = 3.000000000000055
Weights: [[-2.95008595e-07]
[-4.89560038e-07]]
Bias: [-1.09861179]
Epoch 1500: loss = 3.000000000000001
Weights: [[-4.46992207e-08]
[-7.41773288e-08]]
Bias: [-1.09861221]
Epoch 1600: loss = 3.0
Weights: [[-5.74942725e-09]
[-9.54104229e-09]]
Bias: [-1.09861228]
Epoch 1700: loss = 3.0
Weights: [[-6.18335872e-10]
[-1.02611408e-09]]
Bias: [-1.09861229]
Epoch 1800: loss = 3.0
Weights: [[-5.45849278e-11]
[-9.05823934e-11]]
Bias: [-1.09861229]
Epoch 1900: loss = 3.0
Weights: [[-3.86521516e-12]
[-6.41401071e-12]]
Bias: [-1.09861229]
Epoch 1999: loss = 3.0
Weights: [[-2.19714497e-13]
[-3.64640086e-13]]
Bias: [-1.09861229]
Answer 0 (score: 0)
I suggest you try Xavier weight initialization. It looks like this:
W = tf.get_variable("W",
                    shape=[x, y],
                    initializer=tf.contrib.layers.xavier_initializer())
where x and y are the shape of the layer.
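Plugged into the question's OneLayerNet.__init__, that would look roughly like this (a sketch only; the dtype is passed explicitly because the question's graph uses float64, and "w" is just the name the question already uses):

# Sketch: replace the tf.random_normal initialization of self.w with Xavier/Glorot.
# tf.contrib.layers.xavier_initializer defaults to float32, so pass float64 here.
self.w = tf.get_variable(
    "w",
    shape=[num_feats, num_outputs],
    dtype=tf.float64,
    initializer=tf.contrib.layers.xavier_initializer(dtype=tf.float64))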