Tensorflow 2 LSTM模型无法使用序列学习

时间:2020-06-17 09:49:52

标签: python tensorflow keras

我目前正在使用LSTM模型通过Tensorflow 2.2.0进行时间预测

我一直在使用大型数据集,并且一切正常。 但是,数据集的创建需要大量RAM,因此我想使用tensorflow.keras.utils.Sequence来解决问题,我的问题是:

使用序列时,我的模型不再学习(它可以预测整个数据集中真实信号的平均值)

我的数据集是由两个Python列表x_train_flightsy_train_flights创建的,每个列表都包含熊猫DataFrame。对于此列表的每个(x_train_flight, y_train_flight)

  • 形状为x_train_flight的{​​{1}},包含(-1, features)个信号
  • 形状为features
  • y_train_flight,其中一个信号与(-1, 1)的信号在时间上对齐

系统如下所示(不允许共享真实数据,而是使用伪随机信号重新创建了图):

exemple of prediction

在这里,x_train_flights(蓝色和橙色线)和features=2。也就是说,矩形中的10个点(来自look_back=5)用于预测黄金点(与训练阶段x_train_flights中的对应点进行比较)。灰点是先前的预测。

要创建我的数据集,我一直在使用以下功能:

y_train_flights

我的网络适合以下条件:

def lstm_shapify(sequence, look_back, features):
    res = np.empty((look_back, len(sequence), features), dtype=np.float32)

    for i in range(look_back):
        res[i] = np.roll(sequence, -i * features)

    return np.transpose(res, axes=(1, 0, 2))[:-look_back + 1]


def make_dataset(x_flights, y_flights, look_back, features):
    x = np.empty((0, look_back, features), dtype=np.float32)
    y = np.empty((0, 1), dtype=np.float32)

    for i in range(len(x_flights)):
        x_sample = x_flights[i].values
        y_sample = y_flights[i].values[look_back - 1:]

        x = np.concatenate([x, lstm_shapify(x_sample, look_back, features)])
        y = np.concatenate([y, y_sample])

    return x, y

因此,我创建了此自定义序列:

model.fit(
    x_train,
    y_train,
    epochs=7,
    batch_size=batch_size
)

它返回的每个元组(class LSTMGenerator(Sequence): def __init__( self, x_flights: List[DataFrame], y_flights: List[DataFrame], look_back: int, batch_size: int, features: int ): self.x_flights = x_flights self.y_flights = [] self.look_back = look_back self.batch_size = batch_size self.features = features self.length = 0 for y_flight in y_flights: y = y_flight.iloc[look_back - 1:].to_numpy() self.y_flights.append(y) self.length += len(y) // batch_size def __getitem__(self, index): flight_index = 0 while True: n = len(self.y_flights[flight_index]) // self.batch_size if index < n: break flight_index += 1 index = index - n start_index = index * self.batch_size x_batch = lstm_shapify( self.x_flights[flight_index] .iloc[start_index:start_index + self.batch_size + self.look_back - 1] .to_numpy(), self.look_back, self.features ) y_batch = self.y_flights[flight_index][start_index:start_index + self.batch_size] return x_batch, y_batch def __len__(self): return self.length x)分别是两个分别为形状y(batch_size, look_back, features)的numpy数组。

现在我正在尝试使其适合:

(batch_size, 1)

这是我的模特:

model.fit(
    LSTMGenerator(x_train_flights, y_train_flights, look_back, batch_size, features),
    epochs=epochs
)

希望你能帮助我

编辑:有关数据集的更多详细信息

2 个答案:

答案 0 :(得分:0)

这是一个可行的示例:

from tensorflow.keras import *
from tensorflow.keras.layers import *
from tensorflow.keras.utils import *
import numpy as np
import tensorflow as tf

np.random.seed(1234)
tf.random.set_seed(1234)

features = 3
lookback = 7

model = Sequential()
model.add(LSTM(500, input_shape = (lookback, features)))
model.add(Dense(1, activation='tanh'))

XS = np.random.randn(200, features)
YS = np.random.randn(200)

class LookbackSeq(Sequence):
    def __init__(self, XS, YS, batch_size, lookback):
        self.XS = XS
        self.YS = YS
        self.batch_size = batch_size
        self.lookback = lookback

    def __len__(self):
        n_windows = self.XS.shape[0] - self.lookback
        return int(np.ceil(n_windows / self.batch_size))

    def __getitem__(self, i):
        base = i * self.batch_size

        n_windows = self.XS.shape[0] - self.lookback
        batch_size = min(n_windows - base, self.batch_size)

        X = np.zeros((batch_size, self.lookback, self.XS.shape[1]))
        Y = np.zeros((batch_size, 1))
        for i in range(batch_size):
            for j in range(self.lookback):
                X[i, j] = self.XS[base + i + j]
            Y[i] = self.YS[base + i + self.lookback]
        return X, Y

model.compile(optimizer='adam', loss='mse')

# ALL SAMPLES IN MEMORY
X, Y = [], []
for i in range(len(XS) - lookback):
    X.append(XS[i:i+lookback])
    Y.append(YS[i+lookback])
X, Y = np.array(X), np.array(Y)
model.fit(X, Y, epochs = 10, batch_size = 4, shuffle = False)

# GENERATED ON THE FLY
# gen = LookbackSeq(XS, YS, 4, lookback)
# model.fit(x = gen,
#           steps_per_epoch = len(gen),
#           shuffle = False,
#           epochs = 10)

我假设您的输入数据的形状为X = (n_points, n_features)Y = (n_points,)LookbackSeq为您进行批处理和加窗(回溯)。

您可以注释和取消注释相关行,以训练动态生成的样本或将其全部存储在内存中。您应该得到相同的结果。

答案 1 :(得分:0)

我通过休息一下并再次查看代码来解决了这个问题(我意识到这是一个愚蠢的错误):Sequence的问题来自于每批中的样本是时间上连续的样本,而我的所有计算数据集的批次都经过了很好的改组。

我的Sequence有问题,因为这些批次是从随机数据集中以随机索引选择的。现在,我从随机数据集中的随机索引中选择每个样本,以创建一个批次。