Outputting a sequence in a TensorFlow RNN

Time: 2017-07-03 17:21:13

Tags: tensorflow neural-network recurrent-neural-network

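I created a simple TensorFlow program that tries to predict the next character from the previous 3 characters in a body of text.

A single input might look like this: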

np.array(['t','h','i'])

and the target is

np.array(['s'])
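
As a minimal sketch of how such input/target pairs might be built from a text, the helper below slides a window over a string. The function name `make_pairs` and its parameters `n_inputs` and `n_targets` are hypothetical and are not part of the original program.

```python
def make_pairs(text, n_inputs=3, n_targets=1):
    """Slide a window over `text`, returning (input_chars, target_chars) pairs."""
    pairs = []
    # Stop early enough that every window has a full set of targets.
    for start in range(len(text) - n_inputs - n_targets + 1):
        inputs = list(text[start:start + n_inputs])
        targets = list(text[start + n_inputs:start + n_inputs + n_targets])
        pairs.append((inputs, targets))
    return pairs


pairs = make_pairs("this is")
print(pairs[0])  # (['t', 'h', 'i'], ['s'])
```

Raising `n_targets` yields several target characters per window, which is the shape of data a multi-character output would need.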

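I am now trying to extend this to output the next several characters (say, 4) rather than just the next one. To do that, I tried feeding y a longer array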

np.array(['s','','i'])

and I also changed y to
y = tf.placeholder(dtype=tf.int32, shape=[None, n_steps])

However, this produces the error:

Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
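
The error follows from the shapes involved. `sparse_softmax_cross_entropy_with_logits` expects labels to have one dimension fewer than logits, because each label is a class index into the last axis of logits. The sketch below checks only that rank rule on plain shape tuples; `ranks_compatible` is a hypothetical helper, and the concrete sizes are made up for illustration.

```python
def ranks_compatible(labels_shape, logits_shape):
    """Return True if labels have exactly one dimension fewer than logits."""
    return len(labels_shape) == len(logits_shape) - 1


batch, n_steps, vocab_size = 32, 3, 50  # illustrative sizes only

# Original setup: one label per example, logits from the final state.
print(ranks_compatible((batch,), (batch, vocab_size)))  # True

# Failing setup: per-step labels, but logits still come from the final state.
print(ranks_compatible((batch, n_steps), (batch, vocab_size)))  # False

# Per-step labels paired with per-step logits satisfy the rule again.
print(ranks_compatible((batch, n_steps), (batch, n_steps, vocab_size)))  # True
```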

Here is the full code:

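import numpy as np
import tensorflow as tf

# n_steps, vocab_size, input_fn, w_to_id and id_to_w are assumed to be defined elsewhere
embedding_size = 40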
n_neurons = 200
n_output = vocab_size
learning_rate = 0.001

with tf.Graph().as_default():
    x = tf.placeholder(dtype=tf.int32, shape=[None, n_steps])
    y = tf.placeholder(dtype=tf.int32, shape=[None])
    seq_length = tf.placeholder(tf.int32, [None])

    # Let's set up the embedding converting words to vectors
    embeddings = tf.Variable(tf.random_uniform(shape=[vocab_size, embedding_size], minval=-1, maxval=1))
    train_input = tf.nn.embedding_lookup(embeddings, x)

    basic_cell = tf.nn.rnn_cell.GRUCell(num_units=n_neurons)
    outputs, states = tf.nn.dynamic_rnn(basic_cell, train_input, sequence_length=seq_length, dtype=tf.float32)

    logits = tf.layers.dense(states, units=vocab_size, activation=None)
    predictions = tf.nn.softmax(logits)
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=y,
        logits=logits)
    loss = tf.reduce_mean(xentropy)
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    training_op = optimizer.minimize(loss)   

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for r in range(1000):
            x_batch, y_batch, seq_length_batch = input_fn()
            feed_dict = {x: x_batch, y: y_batch, seq_length: seq_length_batch}
            _, loss_out = sess.run([training_op, loss], feed_dict=feed_dict)
            if r % 1000 == 0:
                print("loss_out", loss_out)

        sample_text = "for th"
        sample_text_ids = np.expand_dims(np.array([w_to_id[c] for c in sample_text]+[0, 0], dtype=np.int32), 0)
        prediction_out = sess.run(predictions, feed_dict={x: sample_text_ids, seq_length: np.array([len(sample_text)])})
        print("Result:", id_to_w[np.argmax(prediction_out)])    

1 answer:

Answer 0 (score: 0)

For a many-to-many RNN, you should use tf.contrib.seq2seq.sequence_loss to compute the loss at every time step. Your code should look something like this:

...
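# Use the per-step outputs (not the final state) so logits are [batch, n_steps, vocab_size]
logits = tf.layers.dense(outputs, units=vocab_size, activation=None)
# sequence_loss expects float weights; padded steps get weight 0
weights = tf.sequence_mask(seq_length, n_steps, dtype=tf.float32)
xentropy = tf.contrib.seq2seq.sequence_loss(logits, y, weights)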
...

For details on tf.contrib.seq2seq.sequence_loss, see its API documentation.
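
To make the idea concrete, here is a rough pure-Python sketch of a masked per-step cross-entropy, which is the kind of quantity `sequence_loss` returns when it averages over both time steps and batch. It is an illustration rather than the library's implementation; `masked_sequence_loss` and the toy inputs are hypothetical.

```python
import math


def masked_sequence_loss(logits, targets, weights):
    """Weighted mean of per-step softmax cross-entropy.

    logits:  [batch][time][vocab] floats
    targets: [batch][time] integer class ids
    weights: [batch][time] floats, 0.0 for padded steps
    """
    total_loss = 0.0
    total_weight = 0.0
    for seq_logits, seq_targets, seq_weights in zip(logits, targets, weights):
        for step_logits, target, weight in zip(seq_logits, seq_targets, seq_weights):
            # Numerically stable log-sum-exp over the vocabulary axis.
            peak = max(step_logits)
            log_norm = peak + math.log(sum(math.exp(v - peak) for v in step_logits))
            # Cross-entropy for this step is -log softmax(logits)[target].
            total_loss += weight * (log_norm - step_logits[target])
            total_weight += weight
    return total_loss / total_weight


# One sequence, two steps, vocabulary of 4; the second step is padding.
logits = [[[0.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0]]]
targets = [[2, 1]]
weights = [[1.0, 0.0]]
print(masked_sequence_loss(logits, targets, weights))  # log(4), about 1.386
```

Because the padded step has weight 0, only the first step contributes, and uniform logits over 4 classes give a loss of log(4). The sketch assumes at least one step has a non-zero weight.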