Keras: How to shape 2-D data (time, feature vector) into 3D

Time: 2017-04-10 00:56:48

Tags: python keras lstm

I have training data of length 8475, where each element is a 5-dimensional feature vector at a discrete time step. I am trying to run an LSTM in Keras:

 x_training, x_testing = x_data[:8475], x_data[8475:]
 y_training, y_testing = y_data[:8475], y_data[8475:]

 primary = Sequential()
 primary.add(LSTM(4,input_shape=(5,)))
 primary.add(LSTM(4, activation='sigmoid'))
 primary.add(Dense(1))

 primary.compile(optimizer='rmsprop', 
                 loss='binary_crossentropy', 
                 metrics=['accuracy'])

 primary.fit(x_training, y_training, batch_size=20, epochs=10, shuffle=False)
 score, accuracy = primary.evaluate(x_testing, y_testing, batch_size=20, verbose=0)

And I get this error:

ValueError: Input 0 is incompatible with layer lstm_4: expected ndim=3, found ndim=2

I know that I have to convert this 8475 X 5 data into 3D data with the setup (nb_samples, nb_included_previous_days, features), but I do not understand: What is the difference between the timestep and length of training data? Am I missing something else?

1 Answer:

Answer 0 (score: 1)

  

What is the difference between the timestep and length of training data? Am I missing something else?

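The number of timesteps is the number of RNN/LSTM cells in the unrolled model, and it depends on your sequence length. The length of the training data, by contrast, is the number of samples (sequences) you feed to the model.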

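To use an LSTM, you first need to convert your training data to a 3D format. Suppose you are working on a time-series problem and, to predict each instance in the training data, you consider the previous/neighboring 10 training instances important. In that case, each training instance will have the shape [10, num of feature in each training sample(5 in this case)]. So I guess you need a small modification to create new training data in which each instance is a sequence matrix of the required training samples.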

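The shape of the training data will then be [number of training samples, seq_length(10), num_features(5)]. If you slide a 10-step window one row at a time over 8475 rows, the number of training samples is 8475 - 10 + 1 = 8466.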
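One way to build such an array is a sliding window over the rows. The following is only a minimal sketch: the helper name `make_windows`, the random placeholder data, and the choice to label each window with the target of its last row are assumptions for illustration, not something stated in the question.

```python
import numpy as np

def make_windows(x, y, seq_length):
    """Stack overlapping windows of `seq_length` consecutive rows (hypothetical helper).

    x has shape (n_rows, n_features) and y has shape (n_rows,).
    Returns windows of shape (n_rows - seq_length + 1, seq_length, n_features)
    and, as an assumed labelling scheme, the target of the last row in each window.
    """
    n_windows = len(x) - seq_length + 1
    windows = np.stack([x[i:i + seq_length] for i in range(n_windows)])
    labels = y[seq_length - 1:]
    return windows, labels

# Random placeholder data with the same shape as in the question.
x_data = np.random.rand(8475, 5)
y_data = np.random.randint(0, 2, size=8475)

x_windows, y_windows = make_windows(x_data, y_data, seq_length=10)
print(x_windows.shape)  # (8466, 10, 5)
print(y_windows.shape)  # (8466,)
```

The window length of 10 is just the value used in this answer; it is a modelling choice, and the train/test split would be taken on the windowed arrays.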

Change the input shape in the LSTM layer to [sequence_length, num_features], i.e. (10, 5).
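As a rough sketch of what the adjusted model might look like, the snippet below uses small random placeholder arrays in place of the real windowed data. Two changes go beyond what is described above and are my own assumptions: `return_sequences=True` on the first LSTM, because a stacked LSTM also expects 3D input from the layer before it, and a sigmoid activation on the final `Dense` layer so that its output is a probability suitable for `binary_crossentropy`.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Input, LSTM, Dense

seq_length, num_features = 10, 5

# Small random placeholder arrays standing in for the windowed training data.
x_windows = np.random.rand(200, seq_length, num_features).astype("float32")
y_windows = np.random.randint(0, 2, size=(200, 1)).astype("float32")

primary = Sequential()
# Each sample is a (seq_length, num_features) matrix; the batch axis is implicit.
primary.add(Input(shape=(seq_length, num_features)))
# Assumption: return the full sequence so the second LSTM also receives 3D input.
primary.add(LSTM(4, return_sequences=True))
primary.add(LSTM(4, activation='sigmoid'))
# Assumption: sigmoid output to match the binary_crossentropy loss.
primary.add(Dense(1, activation='sigmoid'))

primary.compile(optimizer='rmsprop',
                loss='binary_crossentropy',
                metrics=['accuracy'])

primary.fit(x_windows, y_windows, batch_size=20, epochs=1, shuffle=False, verbose=0)
predictions = primary.predict(x_windows[:3], verbose=0)
print(predictions.shape)
```

A single epoch is used here only to keep the sketch quick to run; the question's `epochs=10` would apply to the real data.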

This is just my limited understanding of the concept; I hope it works.