I am using LSTMBlockFusedCell for the network I'm working on.
When I feed a single input of size [6, 3169] with a target of [-1, 3169] (reshaped to match the input), it runs fine and makes predictions. The problem comes when I try to feed the same inputs in batches. With a batch size of 100, the inputs reshape fine, but the output gets broadcast to [600, 3169]. I tried pinning the placeholder shapes to the exact input dimensions, but the same error occurs. I am fairly confident my data is formatted correctly: I ran the batch generator and printed the output sizes.
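For reference, the shape arithmetic behind the numbers above (a quick sketch with the dimensions from my setup; nothing here is TensorFlow-specific):

```python
# Shape arithmetic for the mismatch described above (assumed dimensions).
n_input = 6        # time steps per example
vocab = 3169       # one-hot vocabulary size
batch_size = 100

expected_logits = (batch_size, vocab)   # what I expect per batch: (100, 3169)
observed_rows = n_input * batch_size    # 6 * 100 = 600, the row count I see
print(expected_logits, (observed_rows, vocab))
```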
Here is my network:
def rnn(x, weight, bias, n_input, vocab):
    x = tf.reshape(x, [-1, n_input, vocab])
    rnn_cell = tf.contrib.rnn.LSTMBlockFusedCell(n_hidden)
    outputs, states = rnn_cell(x, dtype=tf.float32)
    return tf.matmul(outputs[-1], weight['output']) + bias['output']
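In case the layout matters here: LSTMBlockFusedCell documents its inputs and outputs as time-major, i.e. `[time_len, batch_size, ...]`, rather than batch-major. A NumPy sketch of what `outputs[-1]` selects under each layout (all sizes assumed from my setup):

```python
import numpy as np

n_input, batch_size, n_hidden = 6, 100, 128  # assumed sizes

# Time-major layout [time, batch, hidden]: outputs[-1] is the final time
# step for every example in the batch.
time_major = np.zeros((n_input, batch_size, n_hidden))
print(time_major[-1].shape)   # (100, 128)

# Batch-major layout [batch, time, hidden]: outputs[-1] is instead the
# entire sequence of the last example.
batch_major = np.zeros((batch_size, n_input, n_hidden))
print(batch_major[-1].shape)  # (6, 128)
```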
My batch generator:
def new_data(dat, dic, n_steps, batch_size=100):
    x = np.zeros(shape=(batch_size, n_steps, len(dic)))
    y = np.zeros(shape=(batch_size, n_steps, len(dic)))
    j = 0
    x_dat = np.zeros(shape=(n_steps, len(dic)))
    for sen in dat:
        if len(sen) - 1 > n_steps:
            for i, word in enumerate(sen[0:n_steps]):
                x_dat[i] = one_hot(word, dic)
            y_dat = one_hot(sen[n_steps], dic)
            x[j % batch_size] = x_dat
            y[j % batch_size] = y_dat
            if j % batch_size == 0:
                yield x, y
                x = np.zeros(shape=(batch_size, n_steps, len(dic)))
                y = np.zeros(shape=(batch_size, n_steps, len(dic)))
            j += 1
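One thing I noticed while printing sizes: `y` is allocated with a time axis, `(batch_size, n_steps, len(dic))`, while `y_dat` is a single one-hot vector, so the assignment `y[j % batch_size] = y_dat` broadcasts that one label across all `n_steps` rows. A minimal NumPy sketch of that broadcast (dimensions assumed from my setup):

```python
import numpy as np

batch_size, n_steps, vocab = 100, 6, 3169  # assumed sizes
y = np.zeros((batch_size, n_steps, vocab))

y_dat = np.zeros(vocab)  # one one-hot label, like one_hot(sen[n_steps], dic)
y_dat[42] = 1.0

y[0] = y_dat             # (vocab,) broadcasts across the n_steps axis
print(y.shape)           # (100, 6, 3169): 600 label rows per batch
```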
And my setup:
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
weights = {
    "output": tf.Variable(tf.random_normal([n_hidden, vocab_size]), name="weight_output")
}
bias = {
    "output": tf.Variable(tf.random_normal([vocab_size]), name="bias_output")
}
pred = rnn(X, weights, bias, n_input, vocab)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=pred, labels=Y))
with tf.Session() as sess:
    sess.run(init)
    step = 0
    for epoch in range(n_epochs):
        for x, y in new_data(dat, dic, n_steps):
            _, c = sess.run([optimizer, cost], feed_dict={X: x, Y: y})
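To rule out the data side, this is the kind of sanity check I could run on each batch just before `sess.run` (`check_feed` is a hypothetical helper, not part of the code above, and the expected `y` shape of one label per example is my assumption):

```python
import numpy as np

n_steps, vocab_size, batch_size = 6, 3169, 100  # assumed sizes

def check_feed(x, y):
    # Hypothetical helper: fail fast if a feed array has an unexpected shape.
    x, y = np.asarray(x), np.asarray(y)
    assert x.shape == (batch_size, n_steps, vocab_size), x.shape
    assert y.shape == (batch_size, vocab_size), y.shape  # one label per example

check_feed(np.zeros((batch_size, n_steps, vocab_size)),
           np.zeros((batch_size, vocab_size)))
print("shapes ok")
```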