I am trying to combine a CNN with an attention network. I have a rather unusual requirement, or perhaps I am just not understanding the Keras functionality correctly. I have some source code from GitHub that I am trying to modify. Here is a snippet of the code:
from keras.layers import Input, TimeDistributed, Dropout, Bidirectional, GRU, LSTM, Dense
from keras.models import Model
# Attention, attention_weighted_sentence, MAX_SEQ_LEN, config, data, rnn_type,
# my_init, W_regularizer and l2_reg are defined elsewhere in the original source.

texts_in = Input(shape=(MAX_SEQ_LEN, config.doc_size), dtype='int32')
attention_weighted_sentences = TimeDistributed(attention_weighted_sentence)(texts_in)
if rnn_type == 'GRU':  # '==' for string comparison; 'is' only checks identity
    # sentence_encoder = Bidirectional(GRU(50, return_sequences=True, dropout=0.1, recurrent_dropout=0.2))(attention_weighted_sentences)
    dropout = Dropout(0.1)(attention_weighted_sentences)
    sentence_encoder = Bidirectional(GRU(50, return_sequences=True))(dropout)
else:
    sentence_encoder = Bidirectional(LSTM(50, return_sequences=True, dropout=0.1, recurrent_dropout=0.2))(attention_weighted_sentences)
dense_transform_sentence = Dense(
    100,
    activation='relu',
    name='dense_transform_sentence',
    kernel_regularizer=l2_reg)(sentence_encoder)
# sentence attention
attention_weighted_text = Attention(name="sentence_attention")(dense_transform_sentence)
out = Dense(
    data.documents.target_dim,
    kernel_initializer=my_init,                # Keras 2 name for the old 'init' argument
    kernel_regularizer=W_regularizer(config),  # Keras 2 name for the old 'W_regularizer'
    activation='sigmoid'
)(attention_weighted_text)
# prediction = Dense(19, activation='sigmoid')(attention_weighted_text)
# texts_in = Reshape((500,))(texts_in)
model = Model(inputs=texts_in, outputs=out)
model.summary()
In the input layer (i.e. texts_in), MAX_SEQ_LEN is 1 and config.doc_size is 500. I kept the first dimension at 1 because what I really want is an input of shape (None, 500), not (None, 1, 500).
However, since the TimeDistributed layer does not accept a 2-dimensional input, I had to declare the Input with shape (1, 500). Is there any way to reshape the input before it is passed into the model? Right now, if I reshape the input tensor before passing it to the model, I get a graph disconnected error, which is understandable.
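Concretely, the failing attempt is the commented-out Reshape line in the snippet above; simplified (reusing texts_in and out from the snippet), it amounts to this:

from keras.layers import Reshape

flat_in = Reshape((500,))(texts_in)         # (None, 1, 500) -> (None, 500)
model = Model(inputs=flat_in, outputs=out)  # fails: flat_in does not come from an
                                            # Input layer, so the graph is disconnected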
But, as I said, I only want the input layer to have two dimensions, None and 500. Is it possible to do that and still feed the TimeDistributed layer, or to reshape the tensor later inside the model?
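Here is a minimal sketch of what I am hoping is possible (hypothetical; attention_weighted_sentence is the per-sentence sub-model from the original source): keep the Input two-dimensional and add the extra axis inside the model with a Reshape layer, so the declared model input and the graph source stay the same tensor:

from keras.layers import Input, Reshape, TimeDistributed

texts_in = Input(shape=(config.doc_size,), dtype='int32')  # (None, 500) -- two dims only
expanded = Reshape((1, config.doc_size))(texts_in)         # (None, 1, 500) inside the model
attention_weighted_sentences = TimeDistributed(attention_weighted_sentence)(expanded)
# ... the rest of the network is unchanged; Model(inputs=texts_in, outputs=out)
# still receives the two-dimensional Input, so there is no disconnected graph

If something like this is valid Keras usage, it would give me exactly the (None, 500) input shape I want.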