I have a working hierarchical attention network for text classification that uses an embedding layer built from word2vec embeddings. I would like to try running the data through a FastText model before feeding it into the network. That way I get an embedding for every word, instead of capping the vocabulary at the input dimension of the embedding layer. So I tried removing the embedding layer and feeding in 3-dimensional data (sentences, words, embedding) instead; the preprocessing is sketched after the original model below. The model with the embedding matrix is as follows:
from keras.layers import Input, Embedding, GRU, Bidirectional, TimeDistributed, Masking, Dense
from keras.models import Model
# Attention is a custom attention layer (definition not shown here)

def han2(MAX_NB_WORDS, MAX_WORDS, MAX_SENTS, EMBEDDING_DIM, WORDGRU, embedding_matrix, DROPOUTPER):
    wordInputs = Input(shape=(MAX_WORDS,), dtype='float32')
    # the embedding layer is built from the pretrained word2vec matrix,
    # roughly like this:
    embedding_layer = Embedding(MAX_NB_WORDS, EMBEDDING_DIM,
                                weights=[embedding_matrix], trainable=False)
    wordEmbedding = embedding_layer(wordInputs)
    hij = Bidirectional(GRU(WORDGRU, return_sequences=True), name='gru1')(wordEmbedding)
    Si = Attention(name='att1')(hij)
    wordEncoder = Model(wordInputs, Si)

    # -------------------------------------------------------------------
    docInputs = Input(shape=(MAX_SENTS, MAX_WORDS), dtype='float32')
    sentenceMasking = Masking(mask_value=0.0, name='sentenceMasking')(docInputs)
    sentEncoding = TimeDistributed(wordEncoder, name='sentEncoding')(sentenceMasking)
    hi = Bidirectional(GRU(WORDGRU, return_sequences=True), merge_mode='concat', name='gru2')(sentEncoding)
    Vb = Attention(name='att2')(hi)
    v6 = Dense(8, activation='softmax', kernel_initializer='he_normal', name='dense')(Vb)

    model = Model(inputs=[docInputs], outputs=[v6])
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model, wordEncoder
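To build the new 3-D input (4-D with the batch axis), I embed every word up front. A minimal sketch of that preprocessing, assuming gensim's FastText; the model path and the nested docs structure are just my setup:

import numpy as np
from gensim.models import FastText

ft = FastText.load('fasttext.model')  # my trained FastText model

def embed_docs(docs, max_sents, max_words, embedding_dim):
    # docs: list of documents, each a list of sentences, each a list of tokens
    X = np.zeros((len(docs), max_sents, max_words, embedding_dim), dtype='float32')
    for d, doc in enumerate(docs):
        for s, sent in enumerate(doc[:max_sents]):
            for w, word in enumerate(sent[:max_words]):
                X[d, s, w] = ft.wv[word]  # FastText composes vectors even for unseen words
    return X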
The modified model is as follows:
def han2(MAX_WORDS, MAX_SENTS, EMBEDDING_DIM, WORDGRU, DROPOUTPER):
    # no embedding layer: each word arrives as a precomputed FastText vector
    wordEmbedding = Input(shape=(MAX_WORDS, EMBEDDING_DIM), dtype='float32')
    hij = Bidirectional(GRU(WORDGRU, return_sequences=True), name='gru1')(wordEmbedding)
    Si = Attention(name='att1')(hij)
    wordEncoder = Model(wordEmbedding, Si)

    # -------------------------------------------------------------------
    docInputs = Input(shape=(MAX_SENTS, MAX_WORDS, EMBEDDING_DIM), dtype='float32')
    sentenceMasking = Masking(mask_value=0.0, name='sentenceMasking')(docInputs)
    sentEncoding = TimeDistributed(wordEncoder, name='sentEncoding')(sentenceMasking)
    hi = Bidirectional(GRU(WORDGRU, return_sequences=True), merge_mode='concat', name='gru2')(sentEncoding)
    Vb = Attention(name='att2')(hi)
    v6 = Dense(2, activation='sigmoid', kernel_initializer='he_normal', name='dense')(Vb)

    model = Model(inputs=[docInputs], outputs=[v6])
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model, wordEncoder
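So instead of padded integer indices, the model is now meant to be fed a 4-D float array, e.g. using the embed_docs sketch above (the variable names are mine):

X_train = embed_docs(train_docs, MAX_SENTS, MAX_WORDS, EMBEDDING_DIM)
print(X_train.shape)  # (num_docs, MAX_SENTS, MAX_WORDS, EMBEDDING_DIM)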
But when I build the model, I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-40-2fd9fabaf093> in <module>()
2 from keras import regularizers, constraints, optimizers
3
----> 4 model, model1 = han2(MAX_SENT_LENGTH, MAX_SENTS, EMBEDDING_DIM, 100, 0.2)
5 model.summary()
<ipython-input-39-cdaf09f1852c> in han2(MAX_WORDS, MAX_SENTS, EMBEDDING_DIM, WORDGRU, DROPOUTPER)
19 #print(sentenceMasking.shape)
20
---> 21 sentEncoding = TimeDistributed(wordEncoder, name='sentEncoding')(docInputs)#(sentenceMasking)
22
23 #print(sentenceEncoding.shape)
~/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py in __call__(self, inputs, **kwargs)
617
618 # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 619 output = self.call(inputs, **kwargs)
620 output_mask = self.compute_mask(inputs, previous_mask)
621
~/anaconda3/lib/python3.6/site-packages/keras/layers/wrappers.py in call(self, inputs, training, mask)
215 uses_learning_phase = y._uses_learning_phase
216 # Shape: (num_samples, timesteps, ...)
--> 217 output_shape = self.compute_output_shape(input_shape)
218 y = K.reshape(y, (-1, input_length) + output_shape[2:])
219
~/anaconda3/lib/python3.6/site-packages/keras/layers/wrappers.py in compute_output_shape(self, input_shape)
174 child_output_shape = self.layer.compute_output_shape(child_input_shape)
175 timesteps = input_shape[1]
--> 176 return (child_output_shape[0], timesteps) + child_output_shape[1:]
177
178 def call(self, inputs, training=None, mask=None):
TypeError: can only concatenate tuple (not "list") to tuple
I can't seem to pin down the problem. I have printed the shapes of the various layers, but I cannot get the shape of the word encoder, so I am stuck.
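For reference, this is roughly how I have been trying to inspect the shapes (a sketch; the shapes in the comments are what I expect, assuming the custom Attention layer pools each sentence down to one vector):

print(wordEncoder.input_shape)   # expected: (None, MAX_WORDS, EMBEDDING_DIM)
print(wordEncoder.output_shape)  # expected: (None, 2 * WORDGRU)
wordEncoder.summary()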
EDIT: Apparently the error comes from TimeDistributed which, as discussed here, does not allow multi-dimensional input: https://github.com/keras-team/keras/issues/3057 I don't know how to modify the code to get around this.
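One direction I have considered, based on workarounds mentioned in threads like the one above, is to fold the sentence axis into the batch axis with Lambda/K.reshape instead of wrapping wordEncoder in TimeDistributed. A rough, untested sketch; it bypasses the Masking layer, and it assumes Attention pools each sentence down to a 2*WORDGRU vector:

from keras import backend as K
from keras.layers import Lambda

# (batch, sents, words, dim) -> (batch*sents, words, dim)
folded = Lambda(lambda x: K.reshape(x, (-1, MAX_WORDS, EMBEDDING_DIM)),
                output_shape=(MAX_WORDS, EMBEDDING_DIM))(docInputs)
sentVecs = wordEncoder(folded)  # (batch*sents, 2*WORDGRU)
# (batch*sents, 2*WORDGRU) -> (batch, sents, 2*WORDGRU)
sentEncoding = Lambda(lambda x: K.reshape(x, (-1, MAX_SENTS, 2 * WORDGRU)),
                      output_shape=(MAX_SENTS, 2 * WORDGRU))(sentVecs)

But I am not sure this is the right fix, and I would still like to understand why TimeDistributed fails here.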