Basic RNN training with fit_generator does not produce the expected output shape

Time: 2019-07-18 08:34:55

Tags: tensorflow keras lstm recurrent-neural-network

I am using Keras to implement a basic RNN consisting of a GRU with 512 units followed by a dense layer:

model = Sequential()
model.add(GRU(units=512,
              return_sequences=True,
              input_shape=(None, num_x_signals,)))
model.add(Dense(num_y_signals, activation='sigmoid'))

I need to generate the input batches on the fly, so I use fit_generator:

model.fit_generator(generator=generator_train, epochs=NB_EPOCHS, steps_per_epoch=STEPS_PER_EPOCH,
                        validation_data=generator_test, validation_steps=900, callbacks=callbacks)

This is how I define the batch generator:

SAMPLE_PERIOD_PER_INPUT = 1728
PERIOD_TO_PREDICT = 288
BATCH_SIZE = 64

def batch_generator(batch_size, sequence_length, train = True):
    while True:
        x_shape = (batch_size, sequence_length, num_x_signals)
        x_batch = np.zeros(shape=x_shape, dtype=np.float16)

        y_shape = (batch_size, PERIOD_TO_PREDICT, num_y_signals)
        y_batch = np.zeros(shape=y_shape, dtype=np.float16)

        for i in range(batch_size):
            if train:
                idx = np.random.randint(num_train - sequence_length)

                predict_idx = (idx + sequence_length) - PERIOD_TO_PREDICT

                x_batch[i] = x_train_scaled[idx:idx+sequence_length]
                y_batch[i] = y_train_scaled[predict_idx:idx+sequence_length]
            else:
                idx = np.random.randint(num_test - sequence_length)

                predict_idx = (idx + sequence_length) - PERIOD_TO_PREDICT

                x_batch[i] = x_test_scaled[idx:idx+sequence_length]
                y_batch[i] = y_test_scaled[predict_idx:idx+sequence_length]

        yield (x_batch, y_batch)

generator_train = batch_generator(batch_size=BATCH_SIZE, sequence_length=SAMPLE_PERIOD_PER_INPUT)
generator_test = batch_generator(batch_size=BATCH_SIZE, sequence_length=SAMPLE_PERIOD_PER_INPUT, train = False)
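
To see what one batch from this generator looks like, the following sketch repeats the training branch of the shape logic on random placeholder data. The values of `num_x_signals` and `num_train` are assumptions made up for illustration (the post does not state them), `num_y_signals = 1` is inferred from the Dense output shape in the model summary, and `one_batch` is a hypothetical helper, not part of the original code.

```python
import numpy as np

SAMPLE_PERIOD_PER_INPUT = 1728
PERIOD_TO_PREDICT = 288
BATCH_SIZE = 64

# Assumed placeholder sizes; the real values are not given in the post.
num_x_signals = 7
num_y_signals = 1
num_train = 5000

rng = np.random.default_rng(0)
x_train_scaled = rng.random((num_train, num_x_signals)).astype(np.float16)
y_train_scaled = rng.random((num_train, num_y_signals)).astype(np.float16)

def one_batch(batch_size, sequence_length):
    """Build a single training batch the same way batch_generator does."""
    x_batch = np.zeros((batch_size, sequence_length, num_x_signals), dtype=np.float16)
    y_batch = np.zeros((batch_size, PERIOD_TO_PREDICT, num_y_signals), dtype=np.float16)
    for i in range(batch_size):
        idx = rng.integers(num_train - sequence_length)
        # Targets cover only the last PERIOD_TO_PREDICT steps of the input window.
        predict_idx = (idx + sequence_length) - PERIOD_TO_PREDICT
        x_batch[i] = x_train_scaled[idx:idx + sequence_length]
        y_batch[i] = y_train_scaled[predict_idx:idx + sequence_length]
    return x_batch, y_batch

x_batch, y_batch = one_batch(BATCH_SIZE, SAMPLE_PERIOD_PER_INPUT)
print(x_batch.shape)  # (64, 1728, 7)
print(y_batch.shape)  # (64, 288, 1)
```

So each input sequence has 1728 time steps, while each target sequence has only 288.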

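I also use a "custom" loss function, because I need to ignore the first computed time steps of each sequence, which are expected to be inaccurate: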

warmup_steps = 50

def loss_mse_warmup(y_true, y_pred):
    y_true_slice = y_true[:, warmup_steps:, :]
    y_pred_slice = y_pred[:, warmup_steps:, :]

    loss = tf.losses.mean_squared_error(labels=y_true_slice,
                                        predictions=y_pred_slice)

    loss_mean = tf.reduce_mean(loss)

    return loss_mean

optimizer = RMSprop(lr=1e-3)
model.compile(loss=loss_mse_warmup, optimizer=optimizer)
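
To make the warm-up slicing concrete, here is a rough NumPy analogue of this loss. It is only a sketch for illustration: `mse_after_warmup` is a hypothetical name, the arrays are toy data, and it is not the TensorFlow implementation.

```python
import numpy as np

warmup_steps = 50

def mse_after_warmup(y_true, y_pred, warmup=warmup_steps):
    """Mean squared error over all time steps after the first `warmup` ones."""
    y_true_slice = y_true[:, warmup:, :]
    y_pred_slice = y_pred[:, warmup:, :]
    return float(np.mean((y_true_slice - y_pred_slice) ** 2))

# Toy batch: 2 sequences, 60 time steps, 1 signal.
y_true = np.zeros((2, 60, 1))
y_pred = np.zeros((2, 60, 1))
y_pred[:, :warmup_steps, :] = 5.0  # large errors during warm-up are ignored
y_pred[:, warmup_steps:, :] = 1.0  # error of 1 on every remaining step

print(mse_after_warmup(y_true, y_pred))  # 1.0
```

Both arrays need the same number of time steps for the element-wise difference to work.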

Here is my model summary:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
gru (GRU)                    (None, None, 512)         798720    
_________________________________________________________________
dense (Dense)                (None, None, 1)           513       
=================================================================
Total params: 799,233
Trainable params: 799,233
Non-trainable params: 0
_________________________________________________________________

But when I run it, I get a shape error:

2 root error(s) found.
  (0) Invalid argument: Incompatible shapes: [64,238,1] vs. [64,1678,1]
     [[{{node loss_4/dense_loss/mean_squared_error/SquaredDifference}}]]
     [[loss_4/mul/_167]]
  (1) Invalid argument: Incompatible shapes: [64,238,1] vs. [64,1678,1]
     [[{{node loss_4/dense_loss/mean_squared_error/SquaredDifference}}]]
0 successful operations.
0 derived errors ignored.
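
The two shapes in the message can be reproduced from the constants above. With return_sequences=True the model emits one output per input time step, so `y_pred` has 1728 steps while `y_true` has 288, and dropping 50 warm-up steps from each gives 1678 and 238. The sketch below uses zero-filled stand-in arrays. The final trimming step is only one possible alignment, under the assumption that the targets are meant to match the last PERIOD_TO_PREDICT output steps, as `predict_idx` in the generator suggests.

```python
import numpy as np

BATCH_SIZE = 64
SAMPLE_PERIOD_PER_INPUT = 1728
PERIOD_TO_PREDICT = 288
warmup_steps = 50

# Stand-ins for one batch of model output and targets (single output signal).
y_pred = np.zeros((BATCH_SIZE, SAMPLE_PERIOD_PER_INPUT, 1), dtype=np.float32)
y_true = np.zeros((BATCH_SIZE, PERIOD_TO_PREDICT, 1), dtype=np.float32)

# What the loss function slices before computing the squared difference.
print(y_true[:, warmup_steps:, :].shape)  # (64, 238, 1)
print(y_pred[:, warmup_steps:, :].shape)  # (64, 1678, 1)

# Assumption: compare only the final PERIOD_TO_PREDICT predicted steps.
y_pred_tail = y_pred[:, -PERIOD_TO_PREDICT:, :]
print(y_pred_tail.shape)  # (64, 288, 1)
```

Whether warm-up steps should still be dropped after such trimming depends on what the targets are supposed to represent.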

Any idea why? What did I get wrong?

0 answers:

No answers yet.