Question

I have seen this type of layer initialization in Keras:

from keras.models import Model
from keras.layers import Input, Dense

a = Input(shape=(32,))
b = Dense(32)(a)
c = Dense(b)

It is the initialization of the layer c that is confusing. I have a class object that looks like this:

import tensorflow as tf

class Attention(tf.keras.Model):
    def __init__(self, units):
        super(Attention, self).__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, features, hidden):
        hidden_with_time_axis = tf.expand_dims(hidden, 1)
        score = tf.nn.tanh(self.W1(features) + self.W2(hidden_with_time_axis))
        attention_weights = tf.nn.softmax(self.V(score), axis=1)
        context_vector = attention_weights * features
        context_vector = tf.reduce_sum(context_vector, axis=1)

        return context_vector, attention_weights

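Look at self.W1(features): it takes the features from the previous layer and passes them to W1, an already-initialized Dense layer with units units. What happens in this step, and why do we do it?

Edit: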

class Foo:
    def __init__(self, units):
        self.units=units
    def __call__(self):
        print('called ' + str(self.units))


a=Foo(3)
b=Foo(a)

Why do we need to call a function?

Answer 1

There is a difference between initializing a layer and calling it.

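b = Dense(32)(a) initializes a dense layer with 32 hidden units and then immediately calls that layer on the input a. To follow this, you need to know the concept of callable objects in Python. Basically, any object that defines a __call__ method (which objects derived from the Keras base Layer class do) can be called on an input, i.e. used like a function.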
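To make the two steps concrete, here is a minimal sketch in plain Python. The class Scale is a made-up stand-in for a layer, not anything from Keras; it only illustrates that __init__ stores configuration while __call__ does the work on an input.

```python
class Scale:
    """Toy stand-in for a layer (hypothetical, not part of Keras)."""

    def __init__(self, factor):
        # Initialization: only store the configuration; no input is involved yet.
        self.factor = factor

    def __call__(self, x):
        # Calling: apply the stored configuration to an input.
        return [self.factor * v for v in x]


layer = Scale(2)             # step 1: initialize
out = layer([1, 2, 3])       # step 2: call the object like a function

# Both steps in a single expression, the same pattern as Dense(32)(a).
same = Scale(2)([1, 2, 3])
```

Here out and same hold the same values, because Scale(2)([1, 2, 3]) is just the two steps written back to back.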

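c = Dense(b) will almost certainly not work, and if you really did see it in some tutorial or piece of code, I would avoid that source in the future... It would try to create a layer with b units, which makes no sense if b is the output of another dense layer. Most likely, what you actually saw was something like c = Dense(n_units)(b).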
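For comparison, a corrected version of the snippet from the question might look like the sketch below. The value 16 for the second layer is an arbitrary choice for illustration; the question does not say how many units c should have.

```python
from keras.models import Model
from keras.layers import Input, Dense

a = Input(shape=(32,))   # symbolic input: vectors of length 32
b = Dense(32)(a)         # create a 32-unit layer, then call it on a
c = Dense(16)(b)         # create a 16-unit layer (assumed size), then call it on b

# Wrap the graph from a to c in a model.
model = Model(inputs=a, outputs=c)
```

In each line the first pair of parentheses configures the layer and the second pair applies it to a tensor.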

That being said, all that happens in the Attention snippet is that the layer self.W1 is called on features (same for W2), after it was previously initialized in __init__.
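As for what calling self.W1 on features computes: a dense layer multiplies the last axis of its input by a weight matrix and adds a bias. The NumPy sketch below imitates that under simplifying assumptions. ToyDense is a hypothetical stand-in, the random initialization and the shapes are made up, and the real Keras layer also supports activations and other options that are left out here.

```python
import numpy as np


class ToyDense:
    """Simplified imitation of a dense layer (hypothetical, not Keras)."""

    def __init__(self, units):
        # Initialization: only the output size is known at this point.
        self.units = units
        self.kernel = None
        self.bias = None

    def __call__(self, x):
        if self.kernel is None:
            # Weights are created on the first call, once the input's
            # feature size (its last axis) is known.
            rng = np.random.default_rng(0)
            self.kernel = rng.normal(size=(x.shape[-1], self.units))
            self.bias = np.zeros(self.units)
        # Linear projection of the last axis: x @ kernel + bias.
        return x @ self.kernel + self.bias


W1 = ToyDense(8)                  # as in __init__: configure the layer
features = np.ones((2, 5, 16))    # assumed shape: (batch, locations, feature size)
projected = W1(features)          # as in call: project 16 features down to 8
```

The projection leaves the batch and location axes alone and only changes the size of the last axis, which is what lets the outputs of W1 and W2 be added together in the score computation.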
