Question

I trained an RNN (specifically, a GRU) as a language model. The training setup is sketched in the pseudocode further down.

Basically, during training there is a "guide" input: at each step I feed in the target token from the previous step (teacher forcing?).

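This RNN trains well and converges, but I don't know how to actually use it to generate a sequence from scratch. Since the GRU expects an explicit input argument, how do I tell it to use its own outputs without giving it inputs?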

Basically, I want the generation-time equivalent of this training setup:

inputs  = [<start>, tok1, tok2, tok3, ...]
outputs = [tok1, tok2, tok3, ...]
h0 = initial state of all zeros
gru_outputs, hn = gru(inputs, h0)
cost = loss(gru_outputs, outputs)

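I can't figure out how to do this. Do I need to use a different API and roll the network out manually, both during training and at generation time?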

In case it is of any use, the full code is in this gist: https://gist.github.com/evanthebouncy/b5039dc72d3d9fea66dad3306e479e6b

Thanks!

- Evan

Answer 1

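If you want to test your language model by generating random text, just pick a random token as the first word. After you feed in this random word, the model produces an output; the word you sample from that output becomes the next input to the model, and so on.

Here is some example code: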

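import torch

# Test the model
# (model, corpus, device, vocab_size, num_layers, hidden_size and num_samples
#  are assumed to be defined by the training script.)
with torch.no_grad():
    with open('sample.txt', 'w') as f:
        # Set initial hidden and cell states (this example model is an LSTM;
        # a GRU takes a single hidden-state tensor instead of a tuple)
        state = (torch.zeros(num_layers, 1, hidden_size).to(device),
                 torch.zeros(num_layers, 1, hidden_size).to(device))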

        # Select one word id randomly
        prob = torch.ones(vocab_size)
        input = torch.multinomial(prob, num_samples=1).unsqueeze(1).to(device)

        for i in range(num_samples):
            # Forward propagate RNN 
            output, state = model(input, state)

            # Sample a word id
            prob = output.exp()
            word_id = torch.multinomial(prob, num_samples=1).item()

            # Fill input with sampled word id for the next time step
            input.fill_(word_id)

            # File write
            word = corpus.dictionary.idx2word[word_id]
            word = '\n' if word == '<eos>' else word + ' '
            f.write(word)

            if (i+1) % 100 == 0:
                print('Sampled [{}/{}] words and save to {}'.format(i+1, num_samples, 'sample.txt'))

The full code for the model can also be found here.

You may need to modify the code slightly to make it work with your model.
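Since the question is about a GRU rather than an LSTM, here is a minimal sketch of the same roll-out for a GRU. The `GRULanguageModel` class, the `rollout` helper, and all sizes are hypothetical stand-ins for illustration, not the code from the gist. The sketch suggests that no different API is needed: the same `nn.GRU` that consumes the whole teacher-forced sequence during training can be called one step at a time at generation time, passing the hidden state along and feeding each sampled token back in as the next input.

```python
import torch
import torch.nn as nn


class GRULanguageModel(nn.Module):
    """Hypothetical stand-in model: embedding -> GRU -> vocabulary logits."""

    def __init__(self, vocab_size, embed_size, hidden_size, num_layers=1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.gru = nn.GRU(embed_size, hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, h):
        # tokens: (batch, seq_len) token ids; h: (num_layers, batch, hidden_size)
        x = self.embed(tokens)
        y, h = self.gru(x, h)
        return self.out(y), h  # logits: (batch, seq_len, vocab_size)


def rollout(model, start_id, length, hidden_size, num_layers=1):
    """Sample `length` token ids, feeding each sampled token back as the next input."""
    model.eval()
    tokens = []
    with torch.no_grad():
        h = torch.zeros(num_layers, 1, hidden_size)  # h0 = all zeros, as in training
        inp = torch.tensor([[start_id]])             # shape (batch=1, seq_len=1)
        for _ in range(length):
            logits, h = model(inp, h)                # one step; h carries the history
            probs = torch.softmax(logits[0, -1], dim=-1)
            next_id = torch.multinomial(probs, num_samples=1).item()
            tokens.append(next_id)
            inp = torch.tensor([[next_id]])          # own output becomes the next input
    return tokens


# Usage with an untrained model and made-up sizes, just to show the shapes work.
model = GRULanguageModel(vocab_size=20, embed_size=8, hidden_size=16)
sample = rollout(model, start_id=0, length=10, hidden_size=16)
```

During training you can still pass the full `inputs` sequence to the model in a single call; only generation needs the explicit loop, because each input depends on the previous output. Replacing the `torch.multinomial` call with an `argmax` over `probs` would give greedy decoding instead of sampling.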

How to perform a roll-out of an RNN in PyTorch
