How to use word embeddings for prediction in TensorFlow

Date: 2016-05-15 23:21:22

Tags: tensorflow word-embedding

I have been working through the TensorFlow tutorials and trying to extend the RNN/language model tutorial so that I can predict the next word in a sentence. The tutorial uses word embeddings as the representation of words.

Since the model learns on word embeddings, I assume that any kind of prediction I add will output an embedding as well. What I can't figure out is how to convert those embeddings back into word IDs from my dataset. The only example I have seen keeps an in-memory reverse of the word id -> embedding mapping and uses it for lookup. That obviously won't work for every problem. Is there a better way?

1 answer:

Answer 0: (score: 0)

Assuming you have both `word_to_idx` and `idx_to_word` built from your vocabulary, this is the pseudocode for what you would do.

Imagine the input for prediction is "this is sample":

batch_size = 1
num_steps = 3 # i.e. one step for each word in "this is sample"
hidden_size = 1500
vocab_size = 10000
use `word_to_idx` to translate the input `"this is sample"` into word ids
get the word embedding for each word in the input
the input to the model will be a word embedding of size 1x1500 at each time step
the output of the model at each time step will also be of size 1x1500
y_pred is the output at the last step from the model for the given input
add a projection to the output (i.e. y_pred x Weights(hidden_size, vocab_size) + bias(vocab_size,) = 1x10000)
now sample the output through the function below to get the index with the highest probability
look up `idx_to_word` for the index we just got
feed the generated word along with the previous input to generate the next word, until you get `<eos>` or hit some predefined sentence-stopping condition.
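The projection-and-lookup steps above can be sketched outside of TensorFlow with plain NumPy. Everything here is illustrative: the tiny vocabulary, the random projection weights, and the stand-in `y_pred` are made up, and the real `hidden_size`/`vocab_size` would be 1500/10000 as in the tutorial.

```python
import numpy as np

# Hypothetical tiny vocabulary; in practice word_to_idx / idx_to_word
# come from the tutorial's vocabulary-building step.
idx_to_word = {0: "<eos>", 1: "this", 2: "is", 3: "sample", 4: "text"}
word_to_idx = {w: i for i, w in idx_to_word.items()}

hidden_size = 8               # 1500 in the tutorial; small for the demo
vocab_size = len(idx_to_word)

rng = np.random.default_rng(0)
weights = rng.standard_normal((hidden_size, vocab_size))  # projection matrix
bias = np.zeros(vocab_size)

def output_to_word(y_pred):
    """Project a 1 x hidden_size model output onto the vocabulary
    and return the word whose logit is largest."""
    logits = y_pred @ weights + bias            # shape: (1, vocab_size)
    idx = int(np.argmax(logits, axis=1)[0])     # index with max score
    return idx_to_word[idx]

# Stand-in for the RNN output at the last time step.
y_pred = rng.standard_normal((1, hidden_size))
print(output_to_word(y_pred))
```

The key point is that you never invert the embedding itself: the learned projection (`weights`, `bias`) maps the hidden state back to a distribution over word ids, and `idx_to_word` finishes the job.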

Here is the sampling function, taken from here:

import numpy as np

def sample(a, temperature=1.0):
    # helper function to sample an index from a probability array
    a = np.log(a) / temperature
    a = np.exp(a) / np.sum(np.exp(a))
    return np.argmax(np.random.multinomial(1, a, 1))
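As a quick sanity check of how the temperature parameter behaves (the function is repeated here so the snippet is self-contained, and the probability values are made up): a temperature well below 1 sharpens the distribution toward its argmax, so sampling becomes effectively greedy.

```python
import numpy as np

def sample(a, temperature=1.0):
    # helper function to sample an index from a probability array
    a = np.log(a) / temperature
    a = np.exp(a) / np.sum(np.exp(a))
    return np.argmax(np.random.multinomial(1, a, 1))

probs = np.array([0.1, 0.2, 0.7])   # pretend softmax output over 3 words
# temperature << 1 sharpens the distribution, so index 2 dominates
print(sample(probs, temperature=0.01))  # almost surely 2
# temperature = 1.0 samples from the original distribution instead
print(sample(probs, temperature=1.0))
```

With `temperature=1.0` you get diversity in generated sentences; lowering it trades that diversity for the most likely continuation.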