I am training a character-level LSTM on the PTB dataset (i.e. predicting the next character).
The dimensions of a training batch are [len(dataset) x vocabulary_size]. Here, vocabulary_size = 27 (26 letters + 1 for the unk token, spaces, and full stops).
This is the code that converts the batch inputs (arrX) and labels (arrY) to one-hot:
arrX = np.zeros((len(train_data), vocabulary_size), dtype=np.float32)
arrY = np.zeros((len(train_data)-1, vocabulary_size), dtype=np.float32)
for i, x in enumerate(train_data):
    arrX[i, x] = 1
arrY = arrX[1:, :]
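The label construction above (each label is the one-hot of the *next* character) can be checked in isolation; a minimal NumPy sketch using a toy index sequence rather than the actual PTB data:

```python
import numpy as np

vocabulary_size = 27
# Hypothetical toy sequence of character indices (0..26), for illustration only
train_data = [3, 0, 19, 19, 4]

arrX = np.zeros((len(train_data), vocabulary_size), dtype=np.float32)
for i, x in enumerate(train_data):
    arrX[i, x] = 1.0

# Labels: row i of arrY is the one-hot of character i+1,
# so drop the first row of arrX (the last input has no label).
arrY = arrX[1:, :]

print(arrX.shape)  # (5, 27)
print(arrY.shape)  # (4, 27)
```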
I created placeholders for the input (X) and labels (Y) in the Graph, to pass them to the tflearn LSTM. Below is the code for the graph and the session.
batch_size = 256
with tf.Graph().as_default():
    X = tf.placeholder(shape=(None, vocabulary_size), dtype=tf.float32)
    Y = tf.placeholder(shape=(None, vocabulary_size), dtype=tf.float32)
    print(utils.get_incoming_shape(tf.concat(0, Y)))
    print(utils.get_incoming_shape(X))
    net = tflearn.lstm(X, 512, return_seq=True)
    print(utils.get_incoming_shape(net))
    net = tflearn.dropout(net, 0.5)
    print(utils.get_incoming_shape(net))
    net = tflearn.lstm(net, 256)
    net = tflearn.fully_connected(net, vocabulary_size, activation='softmax')
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(net, Y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
    init = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init)
        offset = 0
        avg_cost = 0
        total_batch = (train_length-1) / 256
        print("No. of batches:", '%d' % total_batch)
        for i in range(total_batch):
            batch_xs, batch_ys = trainX[offset : batch_size + offset], trainY[offset : batch_size + offset]
            sess.run(optimizer, feed_dict={X: batch_xs, Y: batch_ys})
            cost = sess.run(loss, feed_dict={X: batch_xs, Y: batch_ys})
            avg_cost += cost / total_batch
            if i % 20 == 0:
                print("Step:", '%03d' % i, "Loss:", str(cost))
            offset += batch_size
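The offset bookkeeping in the loop above can be sketched without TensorFlow at all; a small check with a stand-in array (sizes are illustrative, not the real dataset):

```python
import numpy as np

batch_size = 4
data = np.arange(10)                   # stand-in for trainX rows
total_batch = len(data) // batch_size  # integer division drops the partial final batch

offset = 0
batches = []
for i in range(total_batch):
    batches.append(data[offset : offset + batch_size])
    offset += batch_size

print(len(batches))          # 2
print(batches[0].tolist())   # [0, 1, 2, 3]
```

Note that, like the original code, this silently drops the trailing examples that do not fill a whole batch.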
So, I get the following error:
assert ndim >= 3, "Input dim should be at least 3."
AssertionError: Input dim should be at least 3.
How can I resolve this error? Is there an alternative solution? Should I write a separate LSTM definition?
Answer 0 (score: 0)
I'm not familiar with these datasets, but have you tried using tflearn.input_data(shape) together with a tflearn.embedding layer? If you use an embedding, I don't think you'll have to reshape your data into three dimensions.
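For intuition: an embedding layer is essentially a trainable lookup table indexed by character id, so it consumes integer indices directly instead of one-hot rows. A minimal NumPy sketch of the lookup (the matrix values and sizes here are random placeholders, not the tflearn API):

```python
import numpy as np

rng = np.random.default_rng(0)
vocabulary_size = 27
embedding_dim = 8  # hypothetical embedding width

# In a real model this table is trainable; random values here for illustration.
embedding_matrix = rng.standard_normal((vocabulary_size, embedding_dim))

char_ids = np.array([3, 0, 19])        # plain integer indices, no one-hot needed
embedded = embedding_matrix[char_ids]  # row lookup, shape (3, 8)
print(embedded.shape)
```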
Answer 1 (score: 0)
The lstm layer expects a 3-D tensor input of shape [samples, timesteps, input dim], so you can reshape your input data to 3-D. In your question, trainX has shape [len(dataset) x vocabulary_size]. Using trainX = trainX.reshape(trainX.shape + (1,)), the shape becomes [len(dataset), vocabulary_size, 1]. This data can be passed to the lstm through a simple change to the input placeholder X: add one dimension with X = tf.placeholder(shape=(None, vocabulary_size, 1), dtype=tf.float32).
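The suggested reshape can be verified quickly with a dummy array (the array contents are placeholders):

```python
import numpy as np

vocabulary_size = 27
trainX = np.zeros((100, vocabulary_size), dtype=np.float32)  # dummy data

# Append a trailing axis of size 1, turning (100, 27) into (100, 27, 1),
# which matches the [samples, timesteps, input dim] layout the lstm expects.
trainX = trainX.reshape(trainX.shape + (1,))
print(trainX.shape)  # (100, 27, 1)
```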