I am writing an RNN with TensorFlow and I have run into a very strange problem.
My code for a single RNN step is:
def rnn_one_step(x_t, h_t):
    print('Shapes of the imputs : x_t : {}, h_t : {}'.format(x_t.get_shape(), h_t.get_shape()))
    print('reshape size : x_t {}'.format(tf.reshape(x_t, [-1, 1]).get_shape()))

    # convert character id into embedding
    x_t_emb = embed_x(tf.reshape(x_t, [-1, 1]))[:, 0]
    print('Shape of embeded : x_t_emb {}'.format(x_t_emb.get_shape()))

    # concatenate x_t embedding and previous h_t state
    x_and_h = tf.concat([x_t_emb, h_t], 1)
    print('Shape of concatenated : x_and_h {}'.format(x_and_h.get_shape()))
    print('Shape of reshaped concatenated : x_and_h {}'.format(tf.reshape(x_and_h, [-1, 1]).get_shape()))

    # compute next state given x_and_h
    h_next = get_h_next(tf.reshape(x_and_h, [-1, 1]))
    print('Shape of state : h_next {}'.format(h_next.get_shape()))
    print('Shape of reshaped state : h_next {}'.format(tf.reshape(h_next, [-1, 1]).get_shape()))

    # get probabilities for language model P(x_next|h_next)
    output_probas = get_probas(tf.reshape(h_next, [-1, 1]))
    print('Shape of prob : output_probas {}'.format(output_probas.get_shape()))

    return output_probas, h_next
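For context, embed_x, get_h_next and get_probas are ordinary Keras layers, roughly like this (a simplified sketch; the sizes are just the ones that show up in the shapes printed below):

import tensorflow as tf

n_tokens = 54        # vocabulary size (matches the (?, 54) shape below)
embedding_size = 16  # matches the (?, 16) shape below
rnn_num_units = 64   # matches the (?, 64) shape below

embed_x = tf.keras.layers.Embedding(n_tokens, embedding_size)        # token id -> embedding vector
get_h_next = tf.keras.layers.Dense(rnn_num_units, activation='relu',
                                   name='next_h')                    # [x_t_emb, h_t] -> next hidden state
get_probas = tf.keras.layers.Dense(n_tokens, activation='softmax')   # hidden state -> P(x_next | h_next)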
After defining that step, I have:
input_sequence = tf.placeholder(tf.int32, (None, MAX_LENGTH))  # batch of token ids
batch_size = tf.shape(input_sequence)[0]

predicted_probas = []
h_prev = tf.zeros([batch_size, rnn_num_units])  # initial hidden state

for t in range(MAX_LENGTH):
    x_t = input_sequence[:, t]  # column t
    probas_next, h_next = rnn_one_step(x_t, h_prev)
    h_prev = h_next
    predicted_probas.append(probas_next)

# combine predicted_probas into [batch, time, n_tokens] tensor
predicted_probas = tf.transpose(tf.stack(predicted_probas), [1, 0, 2])

# next to last token prediction is not needed
predicted_probas = predicted_probas[:, :-1, :]
which gives me output that looks perfectly fine:
Shapes of the imputs : x_t : (?,), h_t : (?, 64)
reshape size : x_t (?, 1)
Shape of embeded : x_t_emb (?, 16)
Shape of concatenated : x_and_h (?, 80)
Shape of reshaped concatenated : x_and_h (?, 1)
Shape of state : h_next (?, 64)
Shape of reshaped state : h_next (?, 1)
Shape of prob : output_probas (?, 54)
(this block of shape printouts repeats, identically, once per time step, 16 times in total)
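For completeness, the loss and optimize ops used in the training loop below are built from predicted_probas in the usual cross-entropy way, roughly like this (a sketch; the exact definitions should not matter for the error):

# targets are the next tokens; predictions and targets are flattened to 2-D
answers = tf.one_hot(input_sequence[:, 1:], n_tokens)              # [batch, time-1, n_tokens]
predictions_flat = tf.reshape(predicted_probas, [-1, n_tokens])    # [batch*(time-1), n_tokens]
answers_flat = tf.reshape(answers, [-1, n_tokens])

loss = -tf.reduce_mean(tf.reduce_sum(answers_flat * tf.log(predictions_flat + 1e-7), axis=1))
optimize = tf.train.AdamOptimizer().minimize(loss)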
But when I run it like this:
s.run(tf.global_variables_initializer())

batch_size = 32
history = []

for i in range(1000):
    batch = to_matrix(sample(names, batch_size), max_len=MAX_LENGTH)
    print(batch)
    loss_i, _ = s.run([loss, optimize], {input_sequence: batch})
    history.append(loss_i)
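Here names is a list of strings, sample just draws a random batch, and to_matrix pads the names into a matrix of token ids, roughly like this (a sketch; token_to_id stands for the char-to-id dictionary, which is not shown here):

import random
import numpy as np

def sample(seq, size):
    # pick `size` random names from the dataset
    return random.sample(seq, size)

def to_matrix(names_batch, max_len, pad=0):
    # turn a list of names into an int32 matrix of shape [len(names_batch), max_len]
    matrix = np.full((len(names_batch), max_len), pad, dtype='int32')
    for i, name in enumerate(names_batch):
        ids = [token_to_id[c] for c in name[:max_len]]  # token_to_id: char -> int (assumed, not shown)
        matrix[i, :len(ids)] = ids
    return matrix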
the graph fails with this strange error:
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [32,16] vs. shape[1] = [2560,64]
[[Node: concat_4 = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](strided_slice_10, next_h/Relu, concat_4/axis)]]
I can tell the problem is in x_and_h = tf.concat([x_t_emb, h_t], 1), because one of the tensors has shape [32, 16] and the other [2560, 64]. How can that be, when according to the output above the shapes going into the concatenation are always

... h_t : (?, 64)
Shape of embeded : x_t_emb (?, 16)
Shape of concatenated : x_and_h (?, 80)

Why does h_t end up as [2560, 64] instead of [32, 64]?
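For what it's worth, the shapes printed above are the static get_shape() values from graph construction; this is a minimal sketch of how I would check the shapes the session actually computes (reusing s and a batch from the training loop above):

# compare a static shape with the shape the session actually computes at run time
probe = tf.zeros([tf.shape(input_sequence)[0], rnn_num_units])  # built like the initial h_prev
print('static shape :', probe.get_shape())                      # prints (?, 64) at graph time
print('runtime shape:', s.run(tf.shape(probe), {input_sequence: batch}))  # e.g. [32 64] for a 32-name batch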