I am training an LSTM model in Keras and want to add attention on top of it. I am new to Keras and to attention mechanisms. From the link How to add an attention mechanism in keras? I learned how to add attention over my LSTM layer and built this model:
print('Defining a Simple Keras Model...')
lstm_model=Sequential() # or Graph
lstm_model.add(Embedding(output_dim=300,input_dim=n_symbols,mask_zero=True,
weights=[embedding_weights],input_length=input_length))
# Adding Input Length
lstm_model.add(Bidirectional(LSTM(300)))
lstm_model.add(Dropout(0.3))
lstm_model.add(Dense(1,activation='sigmoid'))
# compute importance for each step
attention=Dense(1, activation='tanh')
attention=Flatten()
attention=Activation('softmax')
attention=RepeatVector(64)
attention=Permute([2, 1])
sent_representation=keras.layers.Add()([lstm_model,attention])
sent_representation=Lambda(lambda xin: K.sum(xin, axis=-2),output_shape=(64))(sent_representation)
sent_representation.add(Dense(1,activation='sigmoid'))
rms_prop=RMSprop(lr=0.001,rho=0.9,epsilon=None,decay=0.0)
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
print('Compiling the Model...')
sent_representation.compile(loss='binary_crossentropy',optimizer=adam,metrics=['accuracy'])
#class_mode='binary')
earlyStopping=EarlyStopping(monitor='val_loss',min_delta=0,patience=0,
verbose=0,mode='auto')
print("Train...")
sent_representation.fit(X_train, y_train,batch_size=batch_size,nb_epoch=20,
validation_data=(X_test,y_test),callbacks=[earlyStopping])
The output should be a 0/1 sentiment label. For that I added

sent_representation.add(Dense(1,activation='sigmoid'))

to get a binary result.
This is the error we get when running the code:
ERROR:
File "<ipython-input-6-50a1a221497d>", line 18, in <module>
sent_representation=keras.layers.Add()([lstm_model,attention])
File "C:\Users\DuttaHritwik\Anaconda3\lib\site-packages\keras\engine\topology.py", line 575, in __call__
self.assert_input_compatibility(inputs)
File "C:\Users\DuttaHritwik\Anaconda3\lib\site-packages\keras\engine\topology.py", line 448, in assert_input_compatibility
str(inputs) + '. All inputs to the layer '
ValueError: Layer add_1 was called with an input that isn't a symbolic tensor. Received type: <class 'keras.models.Sequential'>. Full input: [<keras.models.Sequential object at 0x00000220B565ED30>, <keras.layers.core.Permute object at 0x00000220FE853978>]. All inputs to the layer should be tensors.
Can you see what we are doing wrong here?
Answer 0 (score: 1)
keras.layers.Add() expects tensors, so in

sent_representation=keras.layers.Add()([lstm_model,attention])

you are passing a Sequential model as input, hence the error. Change your initial layers from the Sequential model to the functional API:
lstm_section = Embedding(output_dim=300, input_dim=n_symbols, mask_zero=True,
                         weights=[embedding_weights], input_length=input_length)(input)
lstm_section = Bidirectional(LSTM(300))(lstm_section)
lstm_section = Dropout(0.3)(lstm_section)
lstm_section = Dense(1, activation='sigmoid')(lstm_section)
lstm_section is a tensor, and can replace lstm_model in the Add() call.
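(Here input is assumed to be a Keras Input tensor for the padded token-index sequences; a minimal sketch, assuming input_length is your sequence length:)

from keras.layers import Input

input = Input(shape=(input_length,), dtype='int32')  # one integer token index per position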
Since you are using the functional API rather than Sequential, you also need to create the model with the functional API:
your_model = keras.models.Model( inputs, sent_representation )
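The compile and fit calls from your script then go on this model object rather than on the sent_representation tensor, e.g. (a sketch reusing the optimizer and callback you already defined):

your_model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
your_model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=20,
               validation_data=(X_test, y_test), callbacks=[earlyStopping])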
Also worth noting: the attention model in the link you provided multiplies rather than adds, so it is probably worth using keras.layers.Multiply().
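For example, the Add() line would become (a sketch, keeping your variable names):

sent_representation = keras.layers.Multiply()([lstm_section, attention])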
EDIT:
Just noticed that your attention section also isn't building a graph, since you are not passing each layer into the next. It should be:
attention = Dense(1, activation='tanh')(lstm_section)
attention = Flatten()(attention)
attention = Activation('softmax')(attention)
attention = RepeatVector(64)(attention)
attention = Permute([2, 1])(attention)
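Putting the pieces together, here is a minimal end-to-end sketch (untested, assuming n_symbols, embedding_weights, and input_length are defined as in your script). Two details differ from the snippets above: the LSTM needs return_sequences=True so there is one output per timestep for the attention weights to act on, and the repeat size is 600 (2 × 300, since the LSTM is bidirectional) rather than 64. mask_zero is also dropped, because Flatten does not support masking:

from keras import backend as K
from keras.models import Model
from keras.layers import (Input, Embedding, Bidirectional, LSTM, Dropout,
                          Dense, Flatten, Activation, RepeatVector, Permute,
                          Multiply, Lambda)

inputs = Input(shape=(input_length,), dtype='int32')
embedded = Embedding(output_dim=300, input_dim=n_symbols,
                     weights=[embedding_weights],
                     input_length=input_length)(inputs)

# return_sequences=True: one 600-dim vector per timestep, not just the last one
lstm_out = Bidirectional(LSTM(300, return_sequences=True))(embedded)
lstm_out = Dropout(0.3)(lstm_out)

# compute a scalar importance score per timestep, softmaxed over timesteps
attention = Dense(1, activation='tanh')(lstm_out)
attention = Flatten()(attention)
attention = Activation('softmax')(attention)
attention = RepeatVector(600)(attention)   # 600 = 2 * 300 (bidirectional)
attention = Permute([2, 1])(attention)

# weight each timestep's output by its attention score and sum over time
sent_representation = Multiply()([lstm_out, attention])
sent_representation = Lambda(lambda xin: K.sum(xin, axis=1),
                             output_shape=(600,))(sent_representation)

output = Dense(1, activation='sigmoid')(sent_representation)
model = Model(inputs=inputs, outputs=output)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])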