Custom attention layer in Keras

Time: 2018-12-19 13:01:33

Tags: python keras rnn attention-model

I am working on a problem with pairs of questions and answers, plus a label (0, 1) indicating whether the answer is relevant to the question. For each question I have 9 answers labelled 0 and only 1 answer labelled 1.
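
For illustration, one question group in my data looks roughly like this (the text is made up; only the structure matters):

# hypothetical example of one group: 1 question with 10 candidate answers,
# exactly one labelled relevant (1) and nine labelled irrelevant (0)
question = "How do I reset my password?"
candidates = [
    ("Click 'Forgot password' on the login page.", 1),
    ("Our office hours are 9am to 5pm.", 0),
    ("You can contact support by email.", 0),
    # ... seven more answers, each labelled 0
]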

I am trying to implement a custom attention-based recurrent network in Keras that incorporates the question information into the answer representation. The implementation is based on the paper "R-NET: Machine Reading Comprehension with Self-Matching Networks" (https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf); section 3.2 contains the details of the attention model.
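
As I understand section 3.2, at each answer time step t the layer should compute the following (u_j^Q are the question encodings, u_t^P the answer encodings, v_t^P the hidden states):

s_j^t = v^\top \tanh(W_u^Q u_j^Q + W_u^P u_t^P + W_v^P v_{t-1}^P)
a_i^t = \exp(s_i^t) / \sum_j \exp(s_j^t)
c_t = \sum_i a_i^t u_i^Q
v_t^P = \mathrm{GRU}(v_{t-1}^P, [u_t^P, c_t])

The weights Wqu, Wpu, Wpv and v in the code below are meant to correspond to W_u^Q, W_u^P, W_v^P and v above.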

I am new to Keras and have been struggling with this code. The code may also contain other errors. Your help would be greatly appreciated.

The code for the attention model is as follows:

from keras.models import Model
from keras import layers
from keras import backend as K                       # K.dot, K.rnn, etc. are used below
from keras.layers import InputSpec
from keras import activations, initializers
from keras.activations import softmax                # used in the step() function
from keras.engine.topology import Layer
from keras import Input
from keras.optimizers import RMSprop
from keras.layers.recurrent import GRU, LSTM, RNN, GRUCell

class QA_AttentionGRU(GRU):

    def __init__(self,**kwargs):

        super().__init__(**kwargs)

        self.input_spec =[InputSpec(ndim=3), InputSpec(ndim=3)]

    def build(self,input_shape):

        if not (isinstance(input_shape, list) and len(input_shape) == 2):
            raise Exception('Input must be a list of '
                            'two tensors [lstm_input, attn_input].')
        print("Input shape",input_shape)


        self.step_input_shape=self.units+input_shape[-1][-1]
        print(self.step_input_shape)
        self.input_length_custom= input_shape[0][1]

        super().build(input_shape[0])

        # attention parameters from section 3.2 of the paper:
        # Wqu ~ W_u^Q, Wpu ~ W_u^P, Wpv ~ W_v^P, v ~ v
        self.Wqu = self.add_weight(shape=(input_shape[-1][-1], self.units),
                                   initializer='uniform',
                                   name='Wqu')
        self.Wpu = self.add_weight(shape=(input_shape[0][-1], self.units),
                                   initializer='uniform',
                                   name='Wpu')

        self.Wpv = self.add_weight(shape=(self.units, self.units),
                                   initializer='uniform',
                                   name='Wpv')

        self.v = self.add_weight(shape=(self.units, 1),
                                 initializer='uniform',
                                 name='v')
        # self.trainable_weights += [self.Wqu, self.Wpu, self.Wpv, self.v]

        step_input_shape = self.units + input_shape[-1][-1]

        self.built=True

    def call(self, inputs, training=None):

        # inputs is a list of two tensors: inputs[0] is treated as uP (the answer
        # sequence the GRU runs over) and inputs[1] as uQ (the question sequence
        # that is attended over at every step)
        uQ = inputs[1:]
        uP = inputs[:1]
        self._constants = uQ[0]

        initial_states = self.get_initial_state(uP[0])
        print("Initial state", initial_states)
        # run the custom attention step over the answer sequence
        last_output, outputs, states = K.rnn(self.step,
                                             uP[0],
                                             initial_states,
                                             input_length=self.input_length_custom)

        return outputs


    def step(self, inputs, states):

        uP_t = inputs          # answer encoding at the current time step
        vP_tm1 = states[0]     # hidden state from the previous time step
        uQ = self._constants   # full question encoding sequence

        # score the current step and previous state against every question position
        WQ_u_Dot = K.dot(self._constants, self.Wqu)                  # W_u^Q u^Q
        WP_v_Dot = K.dot(K.expand_dims(vP_tm1, axis=1), self.Wpv)    # W_v^P v^P_{t-1}
        WP_u_Dot = K.dot(K.expand_dims(uP_t, axis=1), self.Wpu)      # W_u^P u^P_t

        s_t_hat = K.tanh(WQ_u_Dot + WP_v_Dot + WP_u_Dot)
        s_t = K.dot(s_t_hat, self.v)
        s_t = K.batch_flatten(s_t)
        # attention weights over the question positions (no question mask applied yet)
        a_t = softmax(s_t, axis=1)
        # attention-pooled question context vector
        c_t = K.batch_dot(a_t, uQ, axes=[1, 1])

        # feed [u^P_t, c_t] into the standard GRU step
        GRU_inputs = K.concatenate([uP_t, c_t])

        vP_t, s = super().step(GRU_inputs, states)

        return vP_t, s


attgru = QA_AttentionGRU(units=32, return_sequences=True)

# toy symbolic tensors to test the layer with
uq = K.random_normal(shape=(5, 2, 32))
up = K.random_normal(shape=(5, 8, 16))
attgru(inputs=[uq, up])

The code throws the following errors: Error_Pic1 Error_pic2

0 Answers:

No answers yet