How can I use a batch size greater than 1 in a Keras LSTM network?

Time: 2020-01-22 00:45:32

Tags: python tensorflow keras lstm

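I have been training an LSTM in Keras with a batch size of 1, and it runs very slowly. I would like to increase the batch size to speed up training, but I can't work out how.

The code below demonstrates my problem (my minimal, reproducible example). It works with a batch size of 1, but how can I use a batch size of 2?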

import pandas as pd
import numpy as np

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import TimeDistributed

#the demo data contains 10 instances each with 7 features and 3 targets
#the features are 0 or 1, the targets are the sum of the features in binary (with 3 bits)
raw_data = [[1,1,1,0,0,0,1,1,0,0], #has features 1,1,1,0,0,0,1, which contains 4 (100 in binary) 1's
            [0,0,0,0,0,0,0,0,0,0],[0,1,1,0,0,0,0,0,1,0],[1,0,0,1,1,1,1,1,0,1],[0,0,0,0,0,1,0,0,0,1],
            [1,1,1,1,1,1,0,1,1,0],[1,1,1,0,0,0,0,0,1,1],[1,1,1,1,1,1,1,1,1,1],[0,1,0,0,1,0,1,0,1,1],
            [1,1,1,1,0,1,1,1,1,0]]

#how can I use a batch_size of 2?
batch_size = 1
epochs = 10

df = pd.DataFrame(raw_data)
train_x, train_y =  df.values[:,0:-3], df.values[:,-3:]

#reshape <batch_size, time_steps, seq_len>  https://mc.ai/understanding-input-and-output-shape-in-lstm-keras/
#but can't reshape to a batch size of 2 as I get the ValueError below,
#ValueError: cannot reshape array of size 70 into shape (2,10,7) - which makes sense
#if I remove the batch_size from the reshape I get the ValueError below,
#ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (10, 7)
train_x = np.reshape(train_x, (batch_size, train_x.shape[0], train_x.shape[1]))
train_y = np.reshape(train_y, (batch_size, train_y.shape[0], train_y.shape[1]))

model = Sequential()
model.add(LSTM(batch_input_shape=(batch_size, 10, 7), return_sequences=True, units=7))
model.add(TimeDistributed(Dense(activation='linear', units=3)))
model.compile(loss='mse', optimizer='rmsprop')

#training and testing on the same data here, but it's only example code to demonstrate my batch_size problem
history = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, validation_data=(train_x, train_y))

yhat = model.predict(train_x, batch_size)
print(yhat)

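If I train a model with only Dense layers, I can keep the training/test data 2-dimensional and the batch size is handled for me. An LSTM, however, needs 3-dimensional input. Do I need to create the batches manually before presenting them to the model, by doing something like this: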

train_x = np.reshape(train_x, (batch_size, int(train_x.shape[0]/batch_size), train_x.shape[1]))
train_y = np.reshape(train_y, (batch_size, int(train_y.shape[0]/batch_size), train_y.shape[1]))
...

model.add(LSTM(batch_input_shape=(batch_size, int(train_x.shape[0]/batch_size), 7), return_sequences=True, units=7))

But this raises a ValueError

ValueError: Error when checking input: expected lstm_1_input to have shape (1, 7) but got array with shape (5, 7)

when model.fit is called.
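
For reference, the shape arithmetic behind the reshape attempts can be checked with NumPy alone. This is only a sketch: `features` is a placeholder array with the same shape as train_x, and `batches_per_epoch` is a hypothetical helper, not a Keras function. It relies on the fact that Keras treats the first axis of the input as the sample axis and splits that axis into batches.

```python
import numpy as np

# Placeholder with the same shape as train_x in the example: 10 rows, 7 features.
features = np.zeros((10, 7))
print(features.size)  # 10 * 7 elements in total

# (2, 10, 7) would need 2 * 10 * 7 = 140 elements, so NumPy refuses.
try:
    features.reshape(2, 10, 7)
except ValueError as err:
    print("reshape failed:", err)

# 3-D layouts that do fit the 70 elements: (samples, time_steps, features)
one_sequence = features.reshape(1, 10, 7)    # 1 sample of 10 time steps
two_sequences = features.reshape(2, 5, 7)    # 2 samples of 5 time steps
ten_sequences = features.reshape(10, 1, 7)   # 10 samples of 1 time step


def batches_per_epoch(n_samples, batch_size):
    """Hypothetical helper: ceil(n_samples / batch_size)."""
    return -(-n_samples // batch_size)


for arr in (one_sequence, two_sequences, ten_sequences):
    print(arr.shape, "->", batches_per_epoch(arr.shape[0], 2), "batch(es) with batch_size=2")
```

Only the last layout has enough samples for batch_size=2 to produce more than one batch per epoch.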

In the minimal, reproducible example above, how do I modify it to use a batch size of 2?

1 Answer:

Answer 0 (score: 2)

Set batch_size only in the call to model.fit. The following code works for me:

import pandas as pd
import numpy as np

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import TimeDistributed

#the demo data contains 10 instances each with 7 features and 3 targets
#the features are 0 or 1, the targets are the sum of the features in binary (with 3 bits)
raw_data = [[1,1,1,0,0,0,1,1,0,0], #has features 1,1,1,0,0,0,1, which contains 4 (100 in binary) 1's
            [0,0,0,0,0,0,0,0,0,0],[0,1,1,0,0,0,0,0,1,0],[1,0,0,1,1,1,1,1,0,1],[0,0,0,0,0,1,0,0,0,1],
            [1,1,1,1,1,1,0,1,1,0],[1,1,1,0,0,0,0,0,1,1],[1,1,1,1,1,1,1,1,1,1],[0,1,0,0,1,0,1,0,1,1],
            [1,1,1,1,0,1,1,1,1,0]]

#batch_size is only passed to model.fit below
batch_size = 2
epochs = 10

df = pd.DataFrame(raw_data)
train_x, train_y =  df.values[:,0:-3], df.values[:,-3:]

#reshape to <samples, time_steps, features>  https://mc.ai/understanding-input-and-output-shape-in-lstm-keras/
#the first dimension is fixed at 1 here (one sample of 10 time steps), not batch_size
train_x = np.reshape(train_x, (1, train_x.shape[0], train_x.shape[1]))
train_y = np.reshape(train_y, (1, train_y.shape[0], train_y.shape[1]))

model = Sequential()
model.add(LSTM(batch_input_shape=(1, 10, 7), return_sequences=True, units=7))
model.add(TimeDistributed(Dense(activation='linear', units=3)))
model.compile(loss='mse', optimizer='rmsprop')

#training and testing on the same data here, but it's only example code to demonstrate my batch_size problem
history = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, validation_data=(train_x, train_y))

yhat = model.predict(train_x, 1)
print(yhat)
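
One caveat about the code above: the data is reshaped to (1, 10, 7), so Keras sees a single sample of 10 time steps, and batch_size=2 still produces one batch per epoch. If each of the 10 rows is meant to be an independent sample, a possible alternative is sketched below. It assumes that each row can be read as a sequence of 7 time steps with 1 feature each, which the question does not state. The `Input` layer, dropping `TimeDistributed`, and predicting one 3-value target per row are choices made for this sketch, not part of the original answer.

```python
import numpy as np

from keras.models import Sequential
from keras.layers import Input, LSTM, Dense

# Same demo data as above: 7 binary features followed by 3 target bits per row.
raw_data = np.array([
    [1, 1, 1, 0, 0, 0, 1, 1, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 1, 0], [1, 0, 0, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 1], [1, 1, 1, 1, 1, 1, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 0, 1, 1], [1, 1, 1, 1, 0, 1, 1, 1, 1, 0],
], dtype="float32")

batch_size = 2
epochs = 10

train_x, train_y = raw_data[:, :-3], raw_data[:, -3:]

# Assumption: every row is its own sample, read as 7 time steps of 1 feature.
# Shape becomes (samples, time_steps, features) = (10, 7, 1).
train_x = train_x.reshape(train_x.shape[0], train_x.shape[1], 1)

model = Sequential([
    Input(shape=(7, 1)),            # batch dimension left unspecified
    LSTM(7),                        # returns only the last time step's output
    Dense(3, activation="linear"),  # one 3-value target per sample
])
model.compile(loss="mse", optimizer="rmsprop")

# With 10 samples and batch_size=2, each epoch runs 5 batches.
history = model.fit(train_x, train_y, epochs=epochs,
                    batch_size=batch_size, verbose=0)

yhat = model.predict(train_x, batch_size=batch_size, verbose=0)
print(yhat.shape)
```

Because the batch dimension is not fixed in the model, the same model accepts any batch size in fit and predict.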