Question

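I want to train word vectors/embeddings incrementally. On each incremental run, I want to extend the model's vocabulary and add new rows to the embedding matrix.

The embedding matrix is a partitioned variable, so ideally I would like to avoid using assign, since it is not implemented for partitioned variables.

One approach I tried looks like this: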

        # Set prev_vocab_size and new_vocab_to_add
        # according to the corpus/text of the current run

        prev_embeddings = tf.get_variable(
            'prev_embeddings',
            shape=[prev_vocab_size, FLAGS.embedding_size],
            dtype=tf.float32,
            initializer=tf.random_uniform_initializer(-1.0, 1.0)
        )

        new_embeddings = tf.get_variable(
            'new_embeddings',
            shape=[new_vocab_to_add,
                   FLAGS.embedding_size],
            dtype=tf.float32,
            initializer=tf.random_uniform_initializer(
                -1.0, 1.0)
        )

        combined_embeddings = tf.concat(
            [prev_embeddings, new_embeddings], 0)

        embeddings = tf.Variable(
            combined_embeddings,
            expected_shape=[prev_vocab_size + new_vocab_to_add, FLAGS.embedding_size],
            dtype=tf.float32,
            name='embeddings')

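Now, this works for the first run. On a second run, however, I get an "Assign requires shapes of both tensors to match" error, because the original prev_embeddings variable restored from the first run does not match the new shape I declare for the second run (based on the extended vocabulary).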

So I modified the tf.train.Saver to save new_embeddings as prev_embeddings, like this:

        saver = tf.train.Saver({"prev_embeddings": new_embeddings})

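Now, in the second run, prev_embeddings has the shape that new_embeddings had in the previous run, and I get no error.

However, new_embeddings in the second run now has a different shape than it had in the first run, so when restoring variables from the first run I get another "Assign requires shapes of both tensors to match" error.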

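What is the best way to incrementally extend an embedding variable with new vectors for the new words in the vocabulary, while keeping the old, already trained vectors?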
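
For illustration, the operation being asked for can be sketched outside TensorFlow. The following is a minimal NumPy sketch, not a TensorFlow solution: the helper name `extend_embeddings`, its `seed` parameter, and the stand-in matrix `trained` are hypothetical and only show the intended result, namely that the trained rows are kept at their original indices and freshly initialized rows are appended.

```python
import numpy as np


def extend_embeddings(prev_embeddings, new_vocab_to_add, seed=None):
    """Return a matrix with the trained rows kept and new rows appended.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    rng = np.random.default_rng(seed)
    embedding_size = prev_embeddings.shape[1]
    # New rows use the same uniform(-1, 1) initialization as the snippet above.
    new_rows = rng.uniform(-1.0, 1.0, size=(new_vocab_to_add, embedding_size))
    new_rows = new_rows.astype(prev_embeddings.dtype)
    # Old rows come first, so existing word ids keep their indices.
    return np.concatenate([prev_embeddings, new_rows], axis=0)


# Stand-in for a matrix trained in an earlier run: 5 words, 4 dimensions.
trained = np.ones((5, 4), dtype=np.float32)
extended = extend_embeddings(trained, new_vocab_to_add=3, seed=0)
print(extended.shape)  # (8, 4)
```

An array like `extended` could presumably serve as the initial value of a newly created variable with the larger shape (for example via `tf.constant_initializer`), instead of restoring the checkpoint into a variable declared with the old shape. Whether this carries over cleanly to a partitioned variable is an open assumption here.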

Any help is much appreciated.

Extending a word embedding layer in TensorFlow for incremental word2vec training

0 Answers: