TensorFlow: how to make the results of `tf.nn.sampled_softmax_loss` reproducible

Time: 2017-08-25 10:52:21

Tags: python numpy random tensorflow

I want my TensorFlow runs to produce reproducible results. My approach is to set both the numpy and the tensorflow seeds:

import numpy as np
rnd_seed = 1
np.random.seed(rnd_seed)

import tensorflow as tf
tf.set_random_seed(rnd_seed)

I also make sure that the neural-network weights, which I initialize with tf.truncated_normal, use the same seed: tf.truncated_normal(..., seed=rnd_seed)

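For reasons beyond the scope of this question, I use the sampled softmax loss function, tf.nn.sampled_softmax_loss, and unfortunately I cannot control this function's randomness with a random seed.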

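From the TensorFlow documentation for this function (https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss), the sampled_values parameter appears to be the only one that affects the randomization, but I cannot work out how to actually give it a seed.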

[EDITED] Here is part of my script:

import math  # needed for math.sqrt below
import numpy as np
# set a seed so that the results are consistent
rnd_seed = 1
np.random.seed(rnd_seed)

import tensorflow as tf
tf.set_random_seed(rnd_seed)

embeddings_ini = np.random.uniform(low=-1, high=1, size=(self.vocabulary_size, self.embedding_size))

with graph.as_default(), tf.device('/cpu:0'):

    train_dataset = tf.placeholder(tf.int32, shape=[None, None])
    train_labels = tf.placeholder(tf.int32, shape=[None, 1])
    valid_dataset = tf.constant(self.valid_examples, dtype=tf.int32)

    # Variables.
    initial_embeddings = tf.placeholder(tf.float32, shape=(self.vocabulary_size, self.embedding_size))
    embeddings = tf.Variable(initial_embeddings)

    softmax_weights = tf.Variable(
        tf.truncated_normal([self.vocabulary_size, self.embedding_size],
                            stddev=1.0 / math.sqrt(self.embedding_size), seed=rnd_seed))
    softmax_biases = tf.Variable(tf.zeros([self.vocabulary_size]))

    # Model.
    # Look up embeddings for inputs.
    if self.model == "skipgrams":
        # Skipgram model
        embed = tf.nn.embedding_lookup(embeddings, train_dataset)
    elif self.model == "cbow":
        # CBOW Model
        embeds = tf.nn.embedding_lookup(embeddings, train_dataset)
        embed = tf.reduce_mean(embeds, 1, keep_dims=False)

    # Compute the softmax loss, using a sample of the negative labels each time.
    loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(weights=softmax_weights,
                                                     biases=softmax_biases,
                                                     inputs=embed,
                                                     labels=train_labels,
                                                     num_sampled=self.num_sampled,
                                                     num_classes=self.vocabulary_size))

1 answer:

Answer 0 (score: 0)

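I finally found out how to make the results reproducible. As @Anis suggested, I needed to set the graph-level seed. tf.set_random_seed only applies to the graph that is the default at the moment it is called, so the call has to be made inside the graph.as_default() block: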

with graph.as_default(), tf.device('/cpu:0'):
    tf.set_random_seed(1234)
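
To see why the placement of the call matters, here is a minimal sketch (not the script from the question). It assumes TensorFlow is installed and uses the tf.compat.v1 aliases so that it should run on late 1.x and on 2.x releases. The helper sample_once and the toy random_uniform op are hypothetical stand-ins for the real model; the negative sampler behind tf.nn.sampled_softmax_loss is expected to pick up the same graph-level seed, which matches what I observed above.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style API, also exposed by TF 2.x


def sample_once(seed_inside_graph):
    """Build a fresh graph, draw one random vector, and return it as a list.

    Hypothetical helper for illustration only.
    """
    graph = tf.Graph()
    if not seed_inside_graph:
        # Seeds whatever is the default here, NOT `graph` created above.
        tf1.set_random_seed(1234)
    with graph.as_default():
        if seed_inside_graph:
            # `graph` is the default graph in this scope, so it gets the seed.
            tf1.set_random_seed(1234)
        # Toy random op with no op-level seed; it relies on the graph seed.
        values = tf1.random_uniform([4])
        with tf1.Session(graph=graph) as sess:
            return sess.run(values).tolist()


# Seeding inside the graph: two independent builds give identical draws.
first = sample_once(seed_inside_graph=True)
second = sample_once(seed_inside_graph=True)
print(first == second)

# Seeding outside the graph: `graph` itself stays unseeded, so the two
# draws are expected to differ from run to run.
print(sample_once(seed_inside_graph=False) == sample_once(seed_inside_graph=False))
```

In the question's script the seed was set right after `import tensorflow as tf`, before `graph.as_default()`, which corresponds to the `seed_inside_graph=False` branch of this sketch.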