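I am building an LSTM predictor with Keras. My input array is historical price data. I split the data into blocks of window_size values in order to predict the next prediction_length values. My data is a list of 4246 floats. I split it into 4055 arrays, each of length 168, in order to predict the 24 units that follow each one.
This gives me an x_train set with dimensions (4055, 168). I then scale the data and try to fit the model, but I get a dimension error.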
df = pd.DataFrame(data)
print(f"Len of df: {len(df)}")
min_max_scaler = MinMaxScaler()
H = 24
window_size = 7*H
num_pred_blocks = len(df)-window_size-H+1
x_train = []
y_train = []
for i in range(num_pred_blocks):
    x_train_block = df['C'][i:(i + window_size)]
    x_train.append(x_train_block)
    y_train_block = df['C'][(i + window_size):(i + window_size + H)]
    y_train.append(y_train_block)
LEN = int(len(x_train)*window_size)
x_train = min_max_scaler.fit_transform(x_train)
batch_size = 1
def build_model():
    model = Sequential()
    model.add(LSTM(input_shape=(window_size,batch_size),
                   return_sequences=True,
                   units=num_pred_blocks))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model
num_epochs = epochs
model = build_model()
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
This is the error that is returned:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 4055 arrays: [array([[0.00630006],
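Am I splitting the data incorrectly? Loading it incorrectly? Should the number of units be different from the number of prediction blocks? Any help is appreciated. Thank you.
Edit: The suggestion to convert them to NumPy arrays was correct, although MinMaxScaler() already returns a NumPy array. I reshaped the arrays to the proper dimensions, but now my computer gets a CUDA memory error. I consider the original problem solved. Thank you.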
df = pd.DataFrame(data)
min_max_scaler = MinMaxScaler()
H = prediction_length
window_size = 7*H
num_pred_blocks = len(df)-window_size-H+1
x_train = []
y_train = []
for i in range(num_pred_blocks):
    x_train_block = df['C'][i:(i + window_size)].values
    x_train.append(x_train_block)
    y_train_block = df['C'][(i + window_size):(i + window_size + H)].values
    y_train.append(y_train_block)
x_train = min_max_scaler.fit_transform(x_train)
y_train = min_max_scaler.fit_transform(y_train)
x_train = np.reshape(x_train, (len(x_train), 1, window_size))
y_train = np.reshape(y_train, (len(y_train), 1, H))
batch_size = 1
def build_model():
    model = Sequential()
    model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
                   return_sequences=True,
                   units=100))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model
num_epochs = epochs
model = build_model()
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
Answer 0 (score: 1)
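I don't think you are passing the batch size to the model correctly.
input_shape=(window_size,batch_size) describes the data dimensions only. That is the right idea, but you should use input_shape=(window_size, 1).
If you want to specify the batch size, you have to add another dimension, for example LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2])) (quoted from the Keras documentation).
In your case: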
def build_model():
    model = Sequential()
    # batch_input_shape includes the batch dimension; input_shape must not.
    model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
                   return_sequences=True,
                   units=num_pred_blocks))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model
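You also need to reshape your data so that it has the shape (batch_dim, data_dim_1, data_dim_2). I use numpy, so numpy.reshape() will work.
First, your data should be arranged by rows, so that each row has the shape (1, 168). Then add the batch dimension, which gives (batch_n, 1, 168).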
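The reshape described above can be sketched as follows. This is a minimal illustration using only NumPy. The zero-filled arrays are placeholders standing in for the scaled training data, and the names x_train_3d and y_train_3d are made up for this example.

```python
import numpy as np

# Placeholder data with the shapes from the question:
# 4055 windows of 168 inputs, each paired with 24 target values.
n_samples, window_size, H = 4055, 168, 24
x_train = np.zeros((n_samples, window_size))
y_train = np.zeros((n_samples, H))

# Add a middle axis of length 1 so each sample is a single row:
# (samples, 1, values per row).
x_train_3d = np.reshape(x_train, (n_samples, 1, window_size))
y_train_3d = np.reshape(y_train, (n_samples, 1, H))

print(x_train_3d.shape)  # (4055, 1, 168)
print(y_train_3d.shape)  # (4055, 1, 24)
```

The targets are reshaped the same way because the model ends in TimeDistributed(Dense(H)) with return_sequences=True, so its output also has three dimensions.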
Hope this helps.
Answer 1 (score: 1)
This is probably because x_train and y_train have not been converted to NumPy arrays. Take a closer look at the related issue on GitHub.
model = build_model()
x_train, y_train = np.array(x_train), np.array(y_train)
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
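The error message refers to the model target, which suggests that y_train, still a Python list of 4055 blocks, is what Keras rejects. The sketch below shows what the conversion changes, using only NumPy. The series array is a placeholder for the df['C'] column, not the asker's data, and x_list and y_list are names invented for this example.

```python
import numpy as np

# Placeholder for the price column: 4246 floats, as in the question.
series = np.arange(4246, dtype=float)

H = 24
window_size = 7 * H
num_pred_blocks = len(series) - window_size - H + 1

x_list, y_list = [], []
for i in range(num_pred_blocks):
    # Input window of 168 values, followed by the 24 values to predict.
    x_list.append(series[i:i + window_size])
    y_list.append(series[i + window_size:i + window_size + H])

# Before conversion: a plain list holding one small array per block.
print(type(y_list).__name__, len(y_list))

# After conversion: one 2-D array per variable.
x_train, y_train = np.array(x_list), np.array(y_list)
print(x_train.shape, y_train.shape)  # (4055, 168) (4055, 24)
```

A list of many arrays is interpreted as many separate model inputs or targets, whereas a single stacked array is treated as one batch of samples.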