Question

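I am trying to use an LSTM model to predict the weather (mainly to learn about LSTMs and to practice Python).

I have a dataset of 500,000 rows, each row representing a date, with 8 columns that are my features.

Below is my model.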

 model = Sequential()      
 model.add(LSTM(50, input_shape=(30, 8), return_sequences=True))   
 model.add(Dropout(0.2))

 model.add(LSTM(100, return_sequences=True))
 model.add(Dropout(0.2))

 model.add(LSTM(50, return_sequences=False))
 model.add(Dropout(0.2))

 model.add(Dense(1))
 model.add(Activation('linear'))

 model.fit(
        X,
        y,
        batch_size=512,
        epochs=100,
        validation_split=0.05)

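As I understand the input_shape argument, the first value is the number of time steps, so here I am saying that the last 30 observations should be used to predict the next value. The 8, as I understand it, is the number of features: air pressure, temperature, and so on.

So I convert my X matrix to a 3D matrix with the line below, and X now has shape (500000, 8, 1).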

X = np.reshape(X, (X.shape[0], X.shape[1], 1))

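When I run the model, however, I get the following error.

ValueError: Error when checking input: expected lstm_3_input to have shape (30, 8) but got array with shape (8, 1)

What am I doing wrong?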

Answer 1

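Your problem is with data preparation. See the details on data preparation for LSTMs linked from the original answer.

An LSTM maps a sequence of past observations, given as input, to an output observation. The sequence of observations therefore has to be transformed into multiple samples. Consider a given univariate sequence: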

[10, 20, 30, 40, 50, 60, 70, 80, 90]

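We can divide the sequence into multiple input/output patterns called samples, where three time steps (n_steps = 3) are used as input and one time step is used as the label for the one-step prediction being learned.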

X,              y
10, 20, 30      40
20, 30, 40      50
30, 40, 50      60
# ...

What you want to do is implemented in the split_sequence() function below:

from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Going back to our initial example, the following happens:

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

# [10 20 30] 40
# [20 30 40] 50
# [30 40 50] 60
# [40 50 60] 70
# [50 60 70] 80
# [60 70 80] 90

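Takeaway: your data should now have the shape your LSTM model expects, and you should be able to reshape it as needed. The same approach also works for rows with multiple input features.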
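For the eight-feature data in the question, the same windowing idea can be extended to rows with several features. The following is only a minimal sketch, not the answer author's code: `split_sequences` and `target_col` are hypothetical names, the data is random, and it assumes the value to predict is one column of the same array.

```python
import numpy as np

def split_sequences(data, n_steps, target_col=0):
    """Turn a (rows, features) array into LSTM samples.

    Returns X with shape (samples, n_steps, features)
    and y with shape (samples,).
    """
    data = np.asarray(data)
    X, y = [], []
    for i in range(len(data) - n_steps):
        # n_steps consecutive rows form one input sample
        X.append(data[i:i + n_steps])
        # the assumed target is one column of the row that follows
        y.append(data[i + n_steps, target_col])
    return np.array(X), np.array(y)

# Random stand-in for the weather table: 100 rows, 8 features.
data = np.random.random((100, 8))
X, y = split_sequences(data, n_steps=30)
print(X.shape, y.shape)  # (70, 30, 8) (70,)
```

Each sample here already has the (30, 8) shape that input_shape=(30, 8) asks for, so no extra reshape is needed.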

Answer 2

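I think your input shape is off. The NN has no way of knowing that you want it to take slices of 30 points to predict the 31st. What you need to do is slice your dataset into chunks of length 30 (which means each point is going to be copied 29 times) and train on that. The result will have a shape of (499970, 30, 8), assuming the last point goes only into y. Also, do not add a dummy dimension at the end; that is what conv layers need for the RGB channels.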
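One possible way to do this slicing without a Python loop is NumPy's `sliding_window_view`. This is a sketch under assumptions rather than the answerer's method: the data is random, the row count is scaled down to 500, and column 0 is assumed to be the value to predict.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

n_steps = 30
data = np.random.random((500, 8))  # stand-in: 500 rows, 8 features

# Windows along the row axis; the window axis is appended last: (471, 8, 30).
windows = sliding_window_view(data, n_steps, axis=0)
# Move the time axis to the middle: (471, 30, 8).
windows = windows.transpose(0, 2, 1)

# The final window has no following row to predict, so drop it.
X = windows[:-1]
y = data[n_steps:, 0]  # assumed target: column 0 of the next row

print(X.shape, y.shape)  # (470, 30, 8) (470,)
```

`sliding_window_view` returns a read-only view of the original array, so the overlapping chunks are not duplicated in memory at this point.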

Answer 3

I think you may just need a simple explanation of how layers work. In particular, note that all Keras layers behave something like this:

NAME(output_dim, input_shape = (...,input_dim))

For example, say I have 15000 vectors of length 3 and I want to change them into vectors of length 5. Then something like this will do it:

import numpy as np, tensorflow as tf

X = np.random.random((15000,3))
Y = np.random.random((15000,5))

M = tf.keras.models.Sequential()
M.add(tf.keras.layers.Dense(5,input_shape=(3,)))

M.compile('sgd','mse')
M.fit(X,Y) # Take note that I provided complete working code here. Good practice. 
           # I even include the imports and random data to check that it works.

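Similarly, if my input looks like (1000, 10, 5) and I run it through an LSTM such as LSTM(7), then I should know (automatically) that I will get something like (..., 7) as output. The length-5 vectors are changed into length-7 vectors. This is a rule to understand: the last dimension is always the vector you are changing, and the first argument of the layer is always the dimension to change it to.

Now, the second thing is to understand LSTMs. They use a time axis (which is not the last axis, because, as we just said, that is always the "changing dimension" axis). The time axis is removed if return_sequences=False and kept if return_sequences=True. Some examples: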

LSTM(7) # (10000,100,5) -> (10000,7)
# Here the LSTM will loop through the 100, 5 long vectors (like a time series with memory),
# producing 7 long vectors. Only the last 7 long vector is kept.

LSTM(7,return_sequences=True) # (10000,100,5) -> (10000,100,7)
# Same thing as the layer above, except we keep all the intermediate steps.
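The two rules above can be written down as a few lines of plain Python. This is only an illustrative sketch of the shape arithmetic: `lstm_output_shape` is a hypothetical helper, not part of Keras, and it assumes a three-dimensional (batch, time, features) input.

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Apply the shape rule described above to a (batch, time, features) input."""
    batch, time, _features = input_shape
    if return_sequences:
        # the time axis is kept; only the last axis changes to `units`
        return (batch, time, units)
    # the time axis is removed; only the last step's vector remains
    return (batch, units)

print(lstm_output_shape((10000, 100, 5), 7))                         # (10000, 7)
print(lstm_output_shape((10000, 100, 5), 7, return_sequences=True))  # (10000, 100, 7)
```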

The layer you provided looks like this:

LSTM(50,input_shape=(30,8),return_sequences=True) # (10000,30,8) -> (10000,30,50)

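Note that 30 is the TIME dimension used in your LSTM model. 8 and 50 are the INPUT_DIM and OUTPUT_DIM and have nothing to do with the time axis. Another common misunderstanding: the LSTM expects you to provide each sample with its own "complete past" and its own time axis. That is, an LSTM does not use previous samples for the next sample. Each sample is independent and comes with its own complete past data.

So let's look at your model. Step 1: what is your model doing, and what kind of data does it expect?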

from tensorflow.keras.layers import LSTM, Dropout, Activation, Dense
from tensorflow.keras.models import Sequential

model = Sequential()      
model.add(LSTM(50, input_shape=(30, 8), return_sequences=True))   
model.add(Dropout(0.2))
model.add(LSTM(100, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(50, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('linear'))
model.compile('sgd','mse')

print(model.input_shape)
model.summary() # Let's see what your model is doing.

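So now I can clearly see what your model does: (10000, 30, 8) -> (10000, 30, 50) -> (10000, 30, 100) -> (10000, 50) -> (10000, 1)

Did you expect that? Did you see that these would be the dimensions of the intermediate steps? Now that I know what input and output your model expects, I can easily verify that your model trains and works on this kind of data.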

from tensorflow.keras.layers import LSTM, Dropout, Activation, Dense
from tensorflow.keras.models import Sequential
import numpy as np

X = np.random.random((10000,30,8))
Y = np.random.random((10000,1))

model = Sequential()      
model.add(LSTM(50, input_shape=(30, 8), return_sequences=True))   
model.add(Dropout(0.2))
model.add(LSTM(100, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(50, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('linear'))
model.compile('sgd','mse')

model.fit(X,Y)

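Did you notice that your model expects inputs shaped like (..., 30, 8)? Did you know that your model expects output data shaped like (..., 1)? Knowing what a model wants also means you can now change the model to fit the data you are interested in. If you want your data to run over the 8 parameters as if they were a time axis, then your input dimensions need to reflect that: change the 30 to 8 and the 8 to 1. If you do this, note also that your first layer expands each length-1 vector (a single number) into a length-50 vector. Does that sound like what you want the model to do? Maybe your LSTM should be LSTM(2) or LSTM(5) instead of 50, and so on. You may spend the next 1000 hours trying to find the right parameters for the data you are using.

Maybe you don't want to treat your feature space as a time axis. Maybe try repeating your data into windows of size 10, where each sample has its own history, with dimensions (10000, 10, 8). Then LSTM(50) would take your length-8 feature space and change it into a length-50 feature space while stepping over the TIME AXIS of 10. Maybe you just want to keep the last step, with return_sequences=False.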

Answer 4

Let me copy a function I use for preparing data for an LSTM:

import numpy as np
from itertools import islice

def slice_data_for_lstm(data, lookback):
    return np.array(list(zip(*[islice(np.array(data), i, None, 1) for i in range(lookback)])))

X_sliced = slice_data_for_lstm(X, 30)

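In your case the lookback should be 30, and the function will create stacks of 30 features of shape (8, 1). The resulting data has shape (N, 30, 8, 1). Since the model expects samples of shape (30, 8), the trailing axis of length 1 has to be dropped, or the function applied to X before the reshape.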
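As a quick check of the shapes, here is a sketch that applies the function to a plain (rows, 8) array instead of the reshaped (rows, 8, 1) one, which gives three-dimensional windows that match input_shape=(30, 8). The random data, the row count of 100 and the choice of column 0 as the target are assumptions made for illustration.

```python
import numpy as np
from itertools import islice

def slice_data_for_lstm(data, lookback):
    # zip together `lookback` shifted iterators over the rows
    return np.array(list(zip(*[islice(np.array(data), i, None, 1) for i in range(lookback)])))

X_raw = np.random.random((100, 8))  # stand-in: 100 rows, 8 features
windows = slice_data_for_lstm(X_raw, 30)
print(windows.shape)  # (71, 30, 8)

# The last window has no following row to predict, so drop it when pairing with targets.
X_sliced = windows[:-1]
y = X_raw[30:, 0]  # assumed target: column 0 of the next row
print(X_sliced.shape, y.shape)  # (70, 30, 8) (70,)
```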

Keras LSTM input_shape error
