Consider the following two models:
from tensorflow.python.keras.layers import Input, GRU, Dense, TimeDistributed
from tensorflow.python.keras.models import Model
inputs = Input(batch_shape=(None, None, 100))
gru_out = GRU(32, return_sequences=True)(inputs)
dense = Dense(200, activation='softmax')
decoder_pred = TimeDistributed(dense)(gru_out)
model = Model(inputs=inputs, outputs=decoder_pred)
model.summary()
Output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, None, 100)         0
_________________________________________________________________
gru (GRU)                    (None, None, 32)          12768
_________________________________________________________________
time_distributed (TimeDistri (None, None, 200)         6600
=================================================================
Total params: 19,368
Trainable params: 19,368
Non-trainable params: 0
_________________________________________________________________
The second model:
from tensorflow.python.keras.layers import Input, GRU, Dense
from tensorflow.python.keras.models import Model
inputs = Input(batch_shape=(None, None, 100))
gru_out = GRU(32, return_sequences=True)(inputs)
decoder_pred = Dense(200, activation='softmax')(gru_out)
model = Model(inputs=inputs, outputs=decoder_pred)
model.summary()
Output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         (None, None, 100)         0
_________________________________________________________________
gru_1 (GRU)                  (None, None, 32)          12768
_________________________________________________________________
dense_1 (Dense)              (None, None, 200)         6600
=================================================================
Total params: 19,368
Trainable params: 19,368
Non-trainable params: 0
_________________________________________________________________
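As a sanity check, the parameter counts in both summaries can be reproduced by hand. The sketch below assumes the classic GRU formulation with three gates and a single bias vector per gate, which matches the 12,768 shown above (other GRU configurations, such as a separate recurrent bias, would give a different number). The helper names `gru_params` and `dense_params` are made up for illustration and are not part of Keras.

```python
def gru_params(input_dim, units):
    # Three gates (update, reset, candidate). Each gate has an input kernel
    # (input_dim x units), a recurrent kernel (units x units) and one bias
    # vector (units). Assumption: a single bias per gate.
    return 3 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # One weight matrix (input_dim x units) plus one bias vector (units).
    # The count does not depend on the number of time steps.
    return input_dim * units + units


gru = gru_params(100, 32)
dense = dense_params(32, 200)
total = gru + dense
print(gru, dense, total)
```

The dense count contains no time dimension, which is consistent with both models reporting 6,600 parameters for the output layer whether or not it is wrapped.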
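My question is: does the TimeDistributed layer wrapper actually do anything in the first model? Do the two models differ in any way, given that they have the same total number of parameters?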
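One way to probe the question numerically is to compare a single dense transform applied to the whole 3D tensor against the same transform applied one time step at a time. The NumPy sketch below is not Keras code. It assumes that `Dense` acts on the last axis of its input and that `TimeDistributed` applies the wrapped layer, with shared weights, to each time slice. The shapes mirror the models above, and `W`, `b`, and `softmax` are stand-ins invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, steps, features, units = 2, 5, 32, 200

x = rng.normal(size=(batch, steps, features))  # stand-in for the GRU output
W = rng.normal(size=(features, units))         # shared dense kernel
b = rng.normal(size=(units,))                  # shared dense bias


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


# Dense applied directly to the 3D tensor: matmul over the last axis.
whole = softmax(x @ W + b)

# Same weights applied separately to each time step, then re-stacked
# along the time axis (the behaviour assumed for TimeDistributed).
per_step = np.stack(
    [softmax(x[:, t, :] @ W + b) for t in range(steps)], axis=1
)

same = np.allclose(whole, per_step)
print(whole.shape, same)
```

Under these assumptions the two computations agree, so any difference between the two models would have to come from how the framework handles the wrapper rather than from the arithmetic.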