LSTM trains faster than GRU, but shouldn't it be the other way around?

Asked: 2020-05-23 09:00:51

Tags: python performance tensorflow keras lstm

I have implemented simple LSTM and GRU networks for time-series forecasting:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, GRU, Dense

def LSTM1(T0, tau0, tau1, optimizer):
    model = Sequential()
    model.add(Input(shape=(T0, tau0), dtype="float32", name="Input"))
    model.add(LSTM(units=tau1, activation="tanh", recurrent_activation="tanh", name="LSTM1"))
    model.add(Dense(units=1, activation="exponential", name="Output"))
    model.compile(optimizer=optimizer, loss="mse")
    return model

def GRU1(T0, tau0, tau1, optimizer):
    model = Sequential()
    model.add(Input(shape=(T0, tau0), dtype="float32", name="Input"))
    model.add(GRU(units=tau1, activation="tanh", recurrent_activation="tanh", reset_after=False, name="GRU1"))
    model.add(Dense(units=1, activation="exponential", name="Output"))
    model.compile(optimizer=optimizer, loss="mse")
    return model

The LSTM model has noticeably more parameters than the GRU model:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
LSTM1 (LSTM)                 (None, 5)                 180       
_________________________________________________________________
Output (Dense)               (None, 1)                 6         
=================================================================
Total params: 186
Trainable params: 186
Non-trainable params: 0
_________________________________________________________________

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
GRU1 (GRU)                   (None, 5)                 135      
_________________________________________________________________
Output (Dense)               (None, 1)                 6         
=================================================================
Total params: 141
Trainable params: 141
Non-trainable params: 0
_________________________________________________________________
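For reference, these counts follow directly from the gate structure: an LSTM carries four gate blocks (input, forget, cell, output) while a GRU carries only three (update, reset, candidate), each block holding a kernel, a recurrent kernel, and a bias. A quick sanity check of the summaries above (a sketch, with tau0 and tau1 taken from the setup below):

# Per-gate parameters: kernel (tau0 x tau1) + recurrent kernel (tau1 x tau1) + bias (tau1)
tau0, tau1 = 3, 5
per_gate = tau1 * (tau0 + tau1) + tau1   # 45

print("LSTM:", 4 * per_gate)   # 180 -- four gate blocks
print("GRU: ", 3 * per_gate)   # 135 -- three gate blocks (reset_after=False: one bias per gate)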

Therefore, I expected that training the GRU model would take less time.

T0        = 10     # lookback period
tau0      = 3      # dimension of x_t 
tau1      = 5      # output dimension of the first RNN layer
optimizer = "Adam"

# Create model
model_gru1 = GRU1(T0, tau0, tau1, optimizer)
model_lstm1 = LSTM1(T0, tau0, tau1, optimizer)

However, take the following training data:

import numpy as np

x_train = np.random.rand(100, T0, tau0)   # 100 sequences of T0 time steps with tau0 features
x_valid = np.random.rand(100, T0, tau0)
y_train = np.random.rand(100)             # one scalar target per sequence
y_valid = np.random.rand(100)

and train my models:

from timeit import default_timer as timer

# Train LSTM1 model
tf.random.set_seed(32)

start = timer()
model_lstm1.fit(x=x_train, y=y_train,
                validation_data=(x_valid, y_valid),
                verbose=1,
                batch_size=10, epochs=500)
end = timer()
time_lstm1 = round(end - start, 0)


# Train GRU1 model
tf.random.set_seed(32)

start = timer()
model_gru1.fit(x=x_train, y=y_train,
               validation_data=(x_valid, y_valid),
               verbose=1,
               batch_size=10, epochs=500)
end = timer()
time_gru1 = round(end - start, 0)

The LSTM takes less time:

print("training time GRU1 {} vs. training time LSTM1 {}".format(time_gru1,time_lstm1))

training time GRU1 80.0 vs. training time LSTM1 62.0
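For what it's worth, with layers this small a single timed run largely measures per-step framework overhead rather than the 180-vs-135 weight arithmetic, so numbers like these may be noisy. A minimal sketch of a steadier measurement, assuming a warm-up call plus averaging over repeats (the time_fit helper and its n_repeats parameter are illustrative names, not from my original code):

from timeit import default_timer as timer

def time_fit(model, x, y, val, epochs=500, n_repeats=3):
    # One warm-up epoch so one-off setup costs are excluded from the timing
    model.fit(x, y, validation_data=val, verbose=0, batch_size=10, epochs=1)
    times = []
    for _ in range(n_repeats):
        start = timer()
        model.fit(x, y, validation_data=val, verbose=0, batch_size=10, epochs=epochs)
        times.append(timer() - start)
    return sum(times) / len(times)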

I am using TensorFlow version 2.0.0 on the CPU.

Any ideas?

0 Answers