I am implementing a basic RNN in Keras, consisting of a GRU with 512 units followed by a dense layer:
model = Sequential()
model.add(GRU(units=512,
              return_sequences=True,
              input_shape=(None, num_x_signals,)))
model.add(Dense(num_y_signals, activation='sigmoid'))
I need to generate the input batches on the fly, so I use fit_generator:
model.fit_generator(generator=generator_train, epochs=NB_EPOCHS, steps_per_epoch=STEPS_PER_EPOCH,
                    validation_data=generator_test, validation_steps=900, callbacks=callbacks)
This is how I define the batch generator:
SAMPLE_PERIOD_PER_INPUT = 1728
PERIOD_TO_PREDICT = 288
BATCH_SIZE = 64

def batch_generator(batch_size, sequence_length, train=True):
    while True:
        # Inputs cover the full sequence_length time steps.
        x_shape = (batch_size, sequence_length, num_x_signals)
        x_batch = np.zeros(shape=x_shape, dtype=np.float16)

        # Targets cover only the last PERIOD_TO_PREDICT time steps.
        y_shape = (batch_size, PERIOD_TO_PREDICT, num_y_signals)
        y_batch = np.zeros(shape=y_shape, dtype=np.float16)

        for i in range(batch_size):
            if train:
                # Pick a random window from the training data.
                idx = np.random.randint(num_train - sequence_length)
                predict_idx = (idx + sequence_length) - PERIOD_TO_PREDICT
                x_batch[i] = x_train_scaled[idx:idx+sequence_length]
                y_batch[i] = y_train_scaled[predict_idx:idx+sequence_length]
            else:
                # Pick a random window from the test data.
                idx = np.random.randint(num_test - sequence_length)
                predict_idx = (idx + sequence_length) - PERIOD_TO_PREDICT
                x_batch[i] = x_test_scaled[idx:idx+sequence_length]
                y_batch[i] = y_test_scaled[predict_idx:idx+sequence_length]

        yield (x_batch, y_batch)

generator_train = batch_generator(batch_size=BATCH_SIZE, sequence_length=SAMPLE_PERIOD_PER_INPUT)
generator_test = batch_generator(batch_size=BATCH_SIZE, sequence_length=SAMPLE_PERIOD_PER_INPUT, train=False)
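I also use a "custom" loss function, because I need to ignore the first computed time steps of each sequence, which are expected to be inaccurate: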
warmup_steps = 50

def loss_mse_warmup(y_true, y_pred):
    y_true_slice = y_true[:, warmup_steps:, :]
    y_pred_slice = y_pred[:, warmup_steps:, :]
    loss = tf.losses.mean_squared_error(labels=y_true_slice,
                                        predictions=y_pred_slice)
    loss_mean = tf.reduce_mean(loss)
    return loss_mean

optimizer = RMSprop(lr=1e-3)
model.compile(loss=loss_mse_warmup, optimizer=optimizer)
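To make the intent of the loss concrete, here is a minimal NumPy sketch of what it is meant to compute. The helper name mse_after_warmup and the toy arrays are illustrative assumptions, not part of my actual code, and the sketch only works when y_true and y_pred have the same number of time steps:

```python
import numpy as np

WARMUP_STEPS = 50  # same value as warmup_steps above


def mse_after_warmup(y_true, y_pred, warmup_steps=WARMUP_STEPS):
    """Mean squared error over the time steps after the warm-up period.

    Hypothetical NumPy stand-in for loss_mse_warmup; both arrays are
    expected to have shape (batch, time, signals).
    """
    y_true_slice = y_true[:, warmup_steps:, :]
    y_pred_slice = y_pred[:, warmup_steps:, :]
    return float(np.mean((y_true_slice - y_pred_slice) ** 2))


# Toy data: large errors during warm-up, an error of 1.0 afterwards.
y_true = np.zeros((2, 60, 1))
y_pred = np.zeros((2, 60, 1))
y_pred[:, :WARMUP_STEPS, :] = 5.0
y_pred[:, WARMUP_STEPS:, :] = 1.0

loss = mse_after_warmup(y_true, y_pred)
print(loss)  # only the post-warm-up errors contribute
```

The large errors in the first 50 steps are dropped, so only the remaining steps affect the result.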
Here is my model summary:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
gru (GRU)                    (None, None, 512)         798720
_________________________________________________________________
dense (Dense)                (None, None, 1)           513
=================================================================
Total params: 799,233
Trainable params: 799,233
Non-trainable params: 0
_________________________________________________________________
But when I run it, it reports a shape error:
2 root error(s) found.
  (0) Invalid argument: Incompatible shapes: [64,238,1] vs. [64,1678,1]
     [[{{node loss_4/dense_loss/mean_squared_error/SquaredDifference}}]]
     [[loss_4/mul/_167]]
  (1) Invalid argument: Incompatible shapes: [64,238,1] vs. [64,1678,1]
     [[{{node loss_4/dense_loss/mean_squared_error/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.
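The two sequence lengths in the message appear to line up with the constants above once the warm-up steps are sliced off. A quick arithmetic check (the variable names on the last lines are just for illustration):

```python
SAMPLE_PERIOD_PER_INPUT = 1728  # time steps per input sequence
PERIOD_TO_PREDICT = 288         # time steps per target sequence in y_batch
WARMUP_STEPS = 50               # steps dropped by loss_mse_warmup

# With return_sequences=True the GRU emits one output per input time step,
# so the prediction keeps the full input length before slicing.
pred_len_after_warmup = SAMPLE_PERIOD_PER_INPUT - WARMUP_STEPS

# The generator only fills PERIOD_TO_PREDICT steps for the targets.
true_len_after_warmup = PERIOD_TO_PREDICT - WARMUP_STEPS

print(true_len_after_warmup, pred_len_after_warmup)  # 238 1678
```

So the 238 seems to come from the targets and the 1678 from the predictions.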
Any idea why this happens? Where did I go wrong?