I am trying to use an LSTM model on time-series data. The specific context is Twitter sentiment analysis for future price prediction. My data looks like this:
date mentions likes retweets polarity count Volume Close
2017-04-10 0.24 0.123 -0.58 0.211 0.58 0.98 0.87
2017-04-11 -0.56 0.532 0.77 0.231 -0.23 0.42 0.92
.
.
.
2019-01-10 0.23 0.356 -0.21 -0.682 0.23 -0.12 -0.23
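The data has shape (608, 8). The features I plan to use are columns 2 through 7, and the target I am predicting is Close
(i.e., column 8). I know that an LSTM model requires its input to be a 3D tensor, so I did some manipulation to convert and reshape the data: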
x = np.asarray(data.iloc[:, 1:8])
y = np.asarray(data.iloc[:, 8])
x = x.reshape(x.shape[0], 1, x.shape[1])
Then I tried to train the LSTM model like this:
batch_size = 200
model = Sequential()
model.add(LSTM(batch_size, input_dim=3, activation='relu', return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(loss='mean_squared_error',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=15)
Running this model gives me:
ValueError: Error when checking input: expected lstm_10_input to have
shape (None, 3) but got array with shape (1, 10)
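Does anyone know what I did wrong? Is it the way I prepared the data, or the way I am training the model?
I have been reading many related questions in this community, as well as articles and blogs, but I am still having trouble finding a solution... Any help is appreciated, thanks!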
Answer 0 (score: 1)
x should have shape (batch_size, timesteps, input_dim).
The first argument to LSTM is not the batch size but the output size (the number of units).
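To make the shape convention concrete, here is a small NumPy-only sketch. It is not part of the original question: the random array is a hypothetical stand-in for 608 rows with 6 feature columns. It shows two ways to lay a 2D table out as (samples, timesteps, input_dim); whichever one is chosen, the input_dim given to the first LSTM layer has to match the size of the last axis.

```python
import numpy as np

# Hypothetical stand-in for the question's data: 608 rows, 6 feature columns.
n_samples, n_features = 608, 6
features = np.random.randn(n_samples, n_features)

# Option A: each row is one sample with a single timestep and 6 features.
# The last axis has size 6, so input_dim would be 6.
x_single_step = features.reshape(n_samples, 1, n_features)

# Option B: each row is one sample whose 6 values are treated as
# 6 timesteps of a single feature. The last axis has size 1, so input_dim would be 1.
x_as_sequence = features.reshape(n_samples, n_features, 1)

print(x_single_step.shape)  # (608, 1, 6)
print(x_as_sequence.shape)  # (608, 6, 1)
```

Both layouts hold exactly the same numbers; they differ only in what the LSTM treats as "time" and what it treats as "features".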
Example:
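import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout

# Random stand-in data: 100 rows, 9 columns
df = pd.DataFrame(np.random.randn(100, 9))
x_train = df.iloc[:, 1:8].values
y_train = df.iloc[:, 8].values
# (number of samples, timesteps, input_size); input_size is 1 here
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], 1)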
model = Sequential()
# 16 outputs of first LSTM per time step
model.add(LSTM(16, input_dim=1, activation='relu', return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(8, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(4, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(loss='mean_squared_error',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=15, batch_size=32)
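The reshape above treats the columns of each row as the time axis. For daily data like the question's, another common layout is to let several consecutive days form one sample. The sketch below is an alternative, not what this answer does: the helper name make_windows, the window length of 10, and the random arrays are all assumptions made for illustration.

```python
import numpy as np

def make_windows(features, target, timesteps):
    """Stack `timesteps` consecutive rows into one sample.

    The label for each window is the target value of the row
    immediately after the window (a one-step-ahead forecast).
    """
    xs, ys = [], []
    for start in range(len(features) - timesteps):
        end = start + timesteps
        xs.append(features[start:end])  # rows start .. end-1
        ys.append(target[end])          # value to predict
    return np.asarray(xs), np.asarray(ys)

# Hypothetical stand-in data: 608 days, 6 feature columns, 1 target column.
features = np.random.randn(608, 6)
target = np.random.randn(608)

x_train, y_train = make_windows(features, target, timesteps=10)
print(x_train.shape, y_train.shape)  # (598, 10, 6) (598,)
```

With this layout the first LSTM layer would see 10 timesteps of 6 features each, so its input size would be 6 rather than 1.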