I'm trying to train a word2vec model. It runs, but I'm puzzled about why the loss never decreases; it always stays at around 300.
Could you give me some hints on why this happens and how to fix it?
I've pasted the part of the code that builds the graph, which I hope is helpful.
graph = tf.Graph()
with graph.as_default():
    # Placeholders for a batch of (center word, context word) index pairs
    train_inputs = tf.placeholder(tf.int32, shape=[batch_size], name='train_inputs')
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1], name='train_labels')
    valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

    with tf.device('/cpu:0'):
        with tf.name_scope('Embeddings'):
            embeddings = tf.Variable(
                tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0),
                name='Embeddings')
            embed = tf.nn.embedding_lookup(embeddings, train_inputs)
            tf.summary.histogram(name='Embeddings', values=embeddings)
        with tf.name_scope('Weights'):
            nce_weights = tf.Variable(
                tf.truncated_normal([vocabulary_size, embedding_size],
                                    stddev=1.0 / math.sqrt(embedding_size)),
                name='Weights')
            tf.summary.histogram(name='Weights', values=nce_weights)
        with tf.name_scope('Biases'):
            nce_biases = tf.Variable(tf.zeros([vocabulary_size]), dtype=tf.float32,
                                     name='Biases')
            tf.summary.histogram(name='Biases', values=nce_biases)

    with tf.name_scope('Loss'):
        # Per-example NCE loss, averaged over the batch
        output_layer = tf.nn.nce_loss(weights=nce_weights,
                                      biases=nce_biases,
                                      inputs=embed,
                                      labels=train_labels,
                                      num_sampled=num_sampled,
                                      num_classes=vocabulary_size)
        loss = tf.reduce_mean(output_layer)
        tf.summary.scalar('loss', loss)

    with tf.name_scope('Optimizer'):
        # optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
        optimizer = tf.train.AdamOptimizer(1.0).minimize(loss)

    with tf.name_scope('normalized'):
        # L2-normalize the embeddings for cosine-similarity evaluation
        norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
        normalized_embeddings = embeddings / norm
        valid_embeddings = tf.nn.embedding_lookup(normalized_embeddings, valid_dataset)

    merged = tf.summary.merge_all()
    init = tf.global_variables_initializer()
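(For context, valid_embeddings is only used for a periodic cosine-similarity check against the whole vocabulary, which I haven't pasted. Inside the same graph block it would look roughly like the tutorial's similarity op; this is a sketch, not my exact code:)

with tf.name_scope('similarity'):
    # Rows of normalized_embeddings have unit L2 norm, so this matmul yields the
    # cosine similarity between each validation word and every vocabulary word.
    similarity = tf.matmul(valid_embeddings, normalized_embeddings, transpose_b=True)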
Then the training step looks like this:
with tf.Session(graph=graph) as sess:
    # We must initialize all variables before we use them.
    init.run()
    logging.info("Initialized")
    writer = tf.summary.FileWriter("TB/", graph=sess.graph)  # TensorBoard
    for step in xrange(num_steps):
        batch_inputs, batch_labels = generate_batch(batch_size, num_skips, skip_window)
        nextDict = {train_inputs: batch_inputs, train_labels: batch_labels}
        stepMerged, stepLoss = sess.run([merged, loss], feed_dict=nextDict)
        logging.info("loss for this step(" + str(step) + "):" + str(float(stepLoss)))
Edit 1:
I took a screenshot of the TensorBoard report, but it is only a snapshot, since training is still running.
The training has now passed 120,000 steps (out of a planned 200,000) and has taken about 22 hours so far.