I am currently taking my first steps with Keras on TensorFlow to classify time-series data. I was able to run a very simple model, but after getting some feedback it was suggested that I stack several GRU layers in a row and wrap my Dense layers in a TimeDistributed wrapper. This is the model I tried:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense, TimeDistributed

model = Sequential()
model.add(GRU(100, input_shape=(n_timesteps, n_features), return_sequences=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(TimeDistributed(Dense(units=100, activation='relu')))
model.add(TimeDistributed(Dense(n_outputs, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
When I try to fit the model with input of shape (2357, 128, 11) — that is, 2357 samples, 128 timesteps, 11 features — I get the following error message:
ValueError: Error when checking target: expected time_distributed_2 to have 3 dimensions, but got array with shape (2357, 5)
This is the output of model.summary():
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
gru_1 (GRU) (None, 128, 100) 33600
_________________________________________________________________
gru_2 (GRU) (None, 128, 100) 60300
_________________________________________________________________
gru_3 (GRU) (None, 128, 100) 60300
_________________________________________________________________
gru_4 (GRU) (None, 128, 100) 60300
_________________________________________________________________
gru_5 (GRU) (None, 128, 100) 60300
_________________________________________________________________
gru_6 (GRU) (None, 128, 100) 60300
_________________________________________________________________
time_distributed_1 (TimeDist (None, 128, 100) 10100
_________________________________________________________________
time_distributed_2 (TimeDist (None, 128, 5) 505
=================================================================
Total params: 345,705
Trainable params: 345,705
Non-trainable params: 0
So what is the correct way to stack multiple GRU layers in a row and add a TimeDistributed wrapper to the Dense layers that follow? I would greatly appreciate any helpful input.
Answer 0 (score: 0)
The code will work if you set return_sequences=False in the last GRU layer.

You only need return_sequences=True when the output of an RNN is fed into another RNN, which preserves the time dimension. When you set return_sequences=False, the layer outputs only its last hidden state (rather than the hidden state at every timestep), so the time dimension disappears.

That is why, with return_sequences=False, the output goes from 3 dimensions, (batch, timesteps, units), down to 2, (batch, units) — which then matches your target array of shape (2357, 5).
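A minimal sketch of that fix (assuming tensorflow.keras; only two GRU layers are shown for brevity): the last GRU uses return_sequences=False (the default), so the Dense layers receive a 2-D tensor and no longer need the TimeDistributed wrapper.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense

n_timesteps, n_features, n_outputs = 128, 11, 5

model = Sequential()
# Intermediate GRU feeds another RNN, so it keeps the time dimension
model.add(GRU(100, input_shape=(n_timesteps, n_features),
              return_sequences=True, dropout=0.5))
# Last GRU: return_sequences=False (default) emits only the final
# hidden state, so the output shape is (None, 100), not (None, 128, 100)
model.add(GRU(100, dropout=0.5))
# With 2-D input, plain Dense layers suffice -- no TimeDistributed needed
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

print(model.output_shape)  # (None, 5), matching targets of shape (2357, 5)
```

Labels of shape (2357, 5), i.e. one one-hot vector per sample, now line up with the model's output directly.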