I'm trying to train a word2vec model. It runs, but I'm puzzled about why the loss never decreases; it always stays at around 300.
Could you give me some hints on why this happens and how to fix it?
I've pasted the part of the code that builds the graph, which I hope is helpful.
graph = tf.Graph()
with graph.as_default():
    # Placeholders for a batch of (center word, context word) index pairs
    train_inputs = tf.placeholder(tf.int32, shape=[batch_size], name='train_inputs')
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1], name='train_labels')
    valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

    with tf.device('/cpu:0'):
        with tf.name_scope('Embeddings'):
            embeddings = tf.Variable(
                tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0),
                name='Embeddings')
            embed = tf.nn.embedding_lookup(embeddings, train_inputs)
            tf.summary.histogram(name='Embeddings', values=embeddings)
        with tf.name_scope('Weights'):
            nce_weights = tf.Variable(
                tf.truncated_normal([vocabulary_size, embedding_size],
                                    stddev=1.0 / math.sqrt(embedding_size)),
                name='Weights')
            tf.summary.histogram(name='Weights', values=nce_weights)
        with tf.name_scope('Biases'):
            nce_biases = tf.Variable(tf.zeros([vocabulary_size]), dtype=tf.float32,
                                     name='Biases')
            tf.summary.histogram(name='Biases', values=nce_biases)

    with tf.name_scope('Loss'):
        # Per-example NCE loss, averaged over the batch
        output_layer = tf.nn.nce_loss(weights=nce_weights,
                                      biases=nce_biases,
                                      inputs=embed,
                                      labels=train_labels,
                                      num_sampled=num_sampled,
                                      num_classes=vocabulary_size)
        loss = tf.reduce_mean(output_layer)
        tf.summary.scalar('loss', loss)

    with tf.name_scope('Optimizer'):
        # optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
        optimizer = tf.train.AdamOptimizer(1.0).minimize(loss)

    with tf.name_scope('normalized'):
        # L2-normalize the embeddings for cosine-similarity evaluation
        norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
        normalized_embeddings = embeddings / norm
        valid_embeddings = tf.nn.embedding_lookup(normalized_embeddings, valid_dataset)

    merged = tf.summary.merge_all()
    init = tf.global_variables_initializer()
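(For context, valid_embeddings is only used for a periodic cosine-similarity check against the whole vocabulary, which I haven't pasted. Inside the same graph block it would look roughly like the tutorial's similarity op; this is a sketch, not my exact code:)

with tf.name_scope('similarity'):
    # Rows of normalized_embeddings have unit L2 norm, so this matmul yields the
    # cosine similarity between each validation word and every vocabulary word.
    similarity = tf.matmul(valid_embeddings, normalized_embeddings, transpose_b=True)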
Then the training step looks like this:
with tf.Session(graph=graph) as sess:
    # We must initialize all variables before we use them.
    init.run()
    logging.info("Initialized")
    writer = tf.summary.FileWriter("TB/", graph=sess.graph)  # TensorBoard
    for step in xrange(num_steps):
        batch_inputs, batch_labels = generate_batch(batch_size, num_skips, skip_window)
        nextDict = {train_inputs: batch_inputs, train_labels: batch_labels}
        stepMerged, stepLoss = sess.run([merged, loss], feed_dict=nextDict)
        logging.info("loss for this step(" + str(step) + "):" + str(float(stepLoss)))
Edit 1:
I took a screenshot of the TensorBoard report, but it is only a snapshot, since training is still running.
The training has now passed 120,000 steps (out of a planned 200,000) and has taken about 22 hours so far.