Say I want to use a pre-trained RNN (e.g. a GRUCell) to generate a sequence from a single vector q; the way I want to do this is:
q[0] := q
q[n+1] := GRU(q[n])
basically feeding the output of each step back in as the input to the next step. The problem is that GRUCell requires the whole input sequence to be specified at once, which I obviously cannot do here. I could try something like this:
for i in range(100):
    GRU.state = GRU.zero_state()
    q[i+1] := GRU(q[:i])[i]
    # (obviously bad pseudocode)
But that seems inelegant and unoptimized, and I can't figure out how to apply truncated backpropagation to it (feed in only part of the generated sequence at a time...?).
So is there a better way, or should I just stick with this ugly approach?
Answer 0 (score: 0)
You can use a gist I made:
https://gist.github.com/CharlieCodex/f494b27698157ec9a802bc231d8dcf31
Just pass in a lambda that converts the output space back into the input space.
Feel free to ask any questions!
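For instance (a hypothetical sketch, not part of the gist itself), if the cell's hidden size differs from the input size, `processing` could be a learned projection back into the input space:

import tensorflow as tf

HIDDEN, VECLEN = 128, 64  # illustrative sizes, not from the gist

# Hypothetical projection mapping the GRU's output (hidden state)
# back to the input dimensionality so it can be fed in again.
W = tf.get_variable('proj_w', shape=[HIDDEN, VECLEN])
b = tf.get_variable('proj_b', shape=[VECLEN], initializer=tf.zeros_initializer())
processing = lambda y: tf.matmul(y, W) + b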
Source:
import tensorflow as tf

def self_feeding_rnn(cell, seqlen, Hin, Xin, processing=tf.identity):
    '''Unroll cell by feeding its output (hidden state) back in as input.
    Outputs are passed through `processing`. It is up to the caller to ensure
    that the processed outputs have a suitable shape to be used as inputs.'''
    # accumulates one [BATCHSIZE, VECLEN] output per time step, seqlen steps in total
    buffer = tf.TensorArray(dtype=tf.float32, size=seqlen)
    initial_state = (0, Hin, Xin, buffer)
    condition = lambda i, *_: i < seqlen

    def do_time_step(i, state, xo, ta):
        Yt, Ht = cell(xo, state)  # one RNN step
        Yro = processing(Yt)      # map the output back into the input space
        return (1 + i, Ht, Yro, ta.write(i, Yro))

    _, Hout, _, final_ta = tf.while_loop(condition, do_time_step, initial_state)
    ta_stack = final_ta.stack()  # [SEQLEN, BATCHSIZE, VECLEN] (time-major)
    # transpose to batch-major; a plain reshape here would scramble batch and time
    Yo = tf.transpose(ta_stack, perm=[1, 0, 2])  # [BATCHSIZE, SEQLEN, VECLEN]
    return Yo, Hout
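As a usage sketch (assuming the TF1-style API above; names and sizes are illustrative): unrolling a GRUCell for 100 steps from a single seed vector, with the hidden size equal to the input size so the default `tf.identity` suffices as `processing`:

BATCHSIZE, SEQLEN, VECLEN = 16, 100, 64  # illustrative sizes

cell = tf.nn.rnn_cell.GRUCell(VECLEN)                # hidden size == input size
q = tf.placeholder(tf.float32, [BATCHSIZE, VECLEN])  # the single seed vector q
Hin = cell.zero_state(BATCHSIZE, dtype=tf.float32)   # initial hidden state

# Each step feeds the previous output back in as the next input.
Yo, Hout = self_feeding_rnn(cell, SEQLEN, Hin, q)
# Yo: [BATCHSIZE, SEQLEN, VECLEN] generated sequence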