I am working on a project where I have to predict the future states of a 1-dimensional vector with y items. I am trying to do this with an ANN setup that combines LSTM cells and convolutional layers. The approach I am using is based on the one in (pre-release paper). The suggested setup is as follows:
In the picture, c is the 1-dimensional vector with y items. The ANN takes the previous n states as input and produces the next o states as output.
Currently, my ANN is set up as follows:
inputLayer = Input(shape = (n, y))
encoder = LSTM(200)(inputLayer)
x = RepeatVector(1)(encoder)
decoder = LSTM(200, return_sequences=True)(x)
x = Conv1D(y, 4, activation = 'linear', padding = 'same')(decoder)
model = Model(inputLayer, x)
Here n is the length of the input sequence and y is the length of the state array. As you can see, I repeat the d vector only once, since I am trying to predict only 1 time step into the future. Is this the right way to set up a network like the one above?
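To make the expected shapes concrete, here is a minimal numpy sketch of a single one-step-ahead training pair (the sizes n = 8 and y = 4 are made up purely for illustration):

```python
import numpy as np

# Hypothetical sizes: n = 8 past states, y = 4 state variables.
n, y = 8, 4
sequence = np.random.rand(20, y)  # one sequence of 20 time steps

# Input: the n most recent states; target: the single next state.
x = sequence[0:n]        # shape (n, y) -> fed to Input(shape=(n, y))
t = sequence[n:n + 1]    # shape (1, y) -> matches the (1, y) decoder output
print(x.shape, t.shape)  # (8, 4) (1, 4)
```

The target keeps an explicit time axis of length 1 so that it lines up with the `RepeatVector(1)` / `return_sequences=True` output of the decoder.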
In addition, I have a numpy array (data) with shape (sequences, time steps, state variables) for training. I am trying to split it randomly into batches with a generator like this:
import numpy as np

def BatchGenerator(batch_size, n, y, data):
    # Infinite loop.
    while True:
        # Allocate a new array for the batch of input-signals.
        x_shape = (batch_size, n, y)
        x_batch = np.zeros(shape=x_shape, dtype=np.float16)

        # Allocate a new array for the batch of output-signals.
        y_shape = (batch_size, 1, y)
        y_batch = np.zeros(shape=y_shape, dtype=np.float16)

        # Fill the batch with random sequences of data.
        for i in range(batch_size):
            # Select a random sequence.
            seq_idx = np.random.randint(data.shape[0])

            # Get a random start-index.
            # This points somewhere into the training-data.
            start_idx = np.random.randint(data.shape[1] - n)

            # Copy the sequence of data starting at this index.
            # Each item inside x_batch has shape [n, y].
            x_batch[i, :, :] = data[seq_idx, start_idx:start_idx+n, :]

            # Each item inside y_batch has shape [1, y]
            # (as we predict only 1 time step ahead).
            y_batch[i, :, :] = data[seq_idx, start_idx+n, :]

        yield (x_batch, y_batch)
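As a self-contained shape check, the same batching logic can be exercised on synthetic data (all sizes here are made up; the generator body is a condensed copy of the one above):

```python
import numpy as np

def batch_generator(batch_size, n, y, data):
    # Same slicing logic as above, condensed for a quick shape check.
    while True:
        x_batch = np.zeros((batch_size, n, y), dtype=np.float16)
        y_batch = np.zeros((batch_size, 1, y), dtype=np.float16)
        for i in range(batch_size):
            seq_idx = np.random.randint(data.shape[0])
            start_idx = np.random.randint(data.shape[1] - n)
            x_batch[i] = data[seq_idx, start_idx:start_idx + n, :]
            y_batch[i, 0] = data[seq_idx, start_idx + n, :]
        yield x_batch, y_batch

# Synthetic data: 5 sequences, 50 time steps, 3 state variables.
data = np.random.rand(5, 50, 3).astype(np.float16)
xb, yb = next(batch_generator(batch_size=4, n=10, y=3, data=data))
print(xb.shape, yb.shape)  # (4, 10, 3) (4, 1, 3)
```

The batch of targets has shape (batch_size, 1, y), which should match the (1, y)-per-sample output of the Conv1D layer.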
The problem is that I get an error whenever I use a batch_size larger than 1. Can someone help me set up this kind of data so it works optimally for training my neural network?
The model is currently trained with:
generator = BatchGenerator(batch_size, n, y, data)
model.fit_generator(generator = generator, steps_per_epoch = steps_per_epoch, epochs = epochs)
Thanks!