
Date: 2020-05-20 15:58:10

Tags: tensorflow keras deep-learning



import tensorflow as tf

# With standalone Keras:
from keras.layers import Layer, Dense

# Or, with tf.keras:
from tensorflow.keras.layers import Layer, Dense

class Attention(Layer):

  def __init__(self, units_att):

     self.units_att = units_att
     self.W = Dense(units_att)
     self.V = Dense(1)

  def __call__(self, values):

      t = tf.constant(0, dtype= tf.int32)    
      time_steps = tf.shape(values)[1]
      initial_outputs = tf.TensorArray(dtype=tf.float32, size=time_steps)
      initial_att =  tf.TensorArray(dtype=tf.float32, size=time_steps)

      def should_continue(t, *args):
          return t < time_steps

      def iteration(t, values, outputs, atts):

        score = self.V(tf.nn.tanh(self.W(values)))

        # attention_weights shape == (batch_size, time_step, 1)
        attention_weights = tf.nn.softmax(score, axis=1)

        # context_vector shape after sum == (batch_size, hidden_size)
        context_vector = attention_weights * values
        context_vector = tf.reduce_sum(context_vector, axis=1)

        outputs = outputs.write(t, context_vector)
        atts = atts.write(t, attention_weights)
        return t + 1, values, outputs, atts

      t, values, outputs, atts = tf.while_loop(should_continue, iteration,
                                  [t, values, initial_outputs, initial_att])

      outputs = outputs.stack()
      outputs = tf.transpose(outputs, [1,0,2])

      atts = atts.stack()
      atts = tf.squeeze(atts, -1)
      atts = tf.transpose(atts, [1,0,2])
      return t, values, outputs, atts

For input = tf.constant(2, shape=[32, 100, 2048], dtype=tf.float32), I get an output of shape [32, 100, 2048] in TF2 and [32, None, 2048] in TF1.

For input = Input(shape=(None, 2048)), I get an output of shape [None, None, 2048] in TF1, and I get the error

TypeError: 'Tensor' object cannot be interpreted as an integer


Finally, in both cases I cannot use the layer in my model, because my model input is Input(shape=(None, 2048)) and I get the error

AttributeError: 'NoneType' object has no attribute '_inbound_nodes'


1 Answer:

Answer 0: (score: 0)


        import tensorflow as tf

        class Bahdanau(tf.keras.layers.Layer):
            def __init__(self, n):
                super(Bahdanau, self).__init__()
                self.w = tf.keras.layers.Dense(n)  # projects the query
                self.u = tf.keras.layers.Dense(n)  # projects the values
                self.v = tf.keras.layers.Dense(1)  # maps each timestep to a scalar score
            def call(self, query, values):
                # (batch, dim) -> (batch, 1, dim) so it broadcasts over timesteps
                query = tf.expand_dims(query, 1)
                e = self.v(tf.nn.tanh(self.w(query) + self.u(values)))
                # normalize the scores over the time axis
                a = tf.nn.softmax(e, axis=1)
                # weighted sum of the values over the time axis
                c = a * values
                c = tf.reduce_sum(c, axis=1)
                return a, c
        # Say we want 10 units in the single-layer MLP that holds w and u
        attentionlayer = Bahdanau(10)
        # Call with inputs: decoder state at t-1 and all encoder hidden states
        a, c = attentionlayer(stminus1, hj)

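We never specify tensor shapes anywhere in the code. It returns a context tensor of shape (batch_size, feature dimension of the 'values'), which is the same size as 'stminus1' (the 'query') when the two share a feature dimension. It does so after attending over all the 'values' (all the encoder hidden states, 'hj') using Bahdanau's attention mechanism.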
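For reference, the computation in `call` can be written out as follows. This is a sketch that reads the notation off the code rather than quoting any paper: $s_{t-1}$ stands for `stminus1`, $h_j$ for the j-th timestep of `hj`, $T$ for the number of timesteps, and $W$, $U$, $v$ for the kernels of `self.w`, `self.u`, `self.v`. The bias terms that `Dense` adds by default are omitted for readability.

```latex
\begin{aligned}
e_j &= v^\top \tanh\left(W s_{t-1} + U h_j\right), \\
a_j &= \frac{\exp(e_j)}{\sum_{k=1}^{T} \exp(e_k)}, \\
c   &= \sum_{j=1}^{T} a_j h_j .
\end{aligned}
```

The softmax runs over the time index $j$, which is why the code uses `axis=1`.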

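So, assuming a batch size of 32, timesteps = 100 and an embedding dimension of 2048, stminus1 should have shape (32, 2048) and hj should have shape (32, 100, 2048). The output context will have shape (32, 2048). We also return the 100 attention weights, in case you want to route them to a nice display.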
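As a quick sanity check of those shapes, here is a minimal sketch that assumes TF2 eager execution. The random tensors are placeholders standing in for a real decoder state and real encoder outputs; they are not part of the original answer.

```python
import tensorflow as tf


class Bahdanau(tf.keras.layers.Layer):
    def __init__(self, n):
        super().__init__()
        self.w = tf.keras.layers.Dense(n)
        self.u = tf.keras.layers.Dense(n)
        self.v = tf.keras.layers.Dense(1)

    def call(self, query, values):
        # (batch, dim) -> (batch, 1, dim) so the query broadcasts over timesteps
        query = tf.expand_dims(query, 1)
        e = self.v(tf.nn.tanh(self.w(query) + self.u(values)))
        a = tf.nn.softmax(e, axis=1)           # weights over the time axis
        c = tf.reduce_sum(a * values, axis=1)  # weighted sum of the values
        return a, c


# Placeholder inputs (assumption): batch of 32, 100 timesteps, 2048 features
stminus1 = tf.random.normal([32, 2048])
hj = tf.random.normal([32, 100, 2048])

a, c = Bahdanau(10)(stminus1, hj)
print(a.shape)  # attention weights, one per timestep
print(c.shape)  # context vector
```

The weights come back with a trailing axis of size 1, so `tf.squeeze(a, -1)` gives a (32, 100) matrix that is easier to plot.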
