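When writing a custom layer, I need to concatenate several weight matrices. If I do the concatenation in the build function, I get the error ValueError: No gradients provided for any variable:... However, if I only store a list of the weights in build and concatenate them in call, it works. Here is the minimal code that reproduces the error: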
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Layer
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam


class MultiInputLinear(Layer):
    def __init__(self, output_dim=32, n_inputs=2):
        super().__init__()
        self.output_dim = output_dim
        self.n_inputs = n_inputs

    def build(self, input_shapes):
        self.input_dim = input_shapes[0][1]
        # The weights are concatenated here, in build; this is the version that fails.
        self.W = tf.concat(
            [
                self.add_weight(
                    name=f'W_{i}',
                    shape=(self.input_dim, self.output_dim),
                    initializer='random_normal',
                    trainable=True
                ) for i in range(self.n_inputs)
            ], axis=0
        )

    def call(self, inputs):
        # Join the inputs along the feature axis, then apply the stacked weights.
        supports = tf.concat(inputs, axis=-1)
        return tf.matmul(supports, self.W)


N = 100
A = [np.random.normal(size=(N, N)) for _ in range(2)]
y = np.random.binomial(1, .1, size=(N, 32))

A_in = [Input(batch_size=N, shape=(N, )) for _ in range(2)]
Y = MultiInputLinear(y.shape[1], 2)(A_in)
model = Model(inputs=A_in, outputs=Y)
model.compile(loss='categorical_crossentropy', optimizer=Adam())
model.fit(A, y, batch_size=N)
However, if in build I keep the weights in a list like this:
self.W_list = [
    self.add_weight(
        name=f'W_{i}',
        shape=(self.input_dim, self.output_dim),
        initializer='random_normal',
        trainable=True
    ) for i in range(self.n_inputs)
]
and then concatenate them inside call, as below, there is no problem:
def call(self, inputs):
    supports = tf.concat(inputs, axis=-1)
    W = tf.concat(self.W_list, axis=0)
    return tf.matmul(supports, W)
I would like to know what causes this difference.
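
One way to probe the difference outside of Keras is to compare the two placements of tf.concat directly under tf.GradientTape. The sketch below is not the layer above: the names w_list, w_static and w_dynamic and the small shapes are made up for illustration, and it assumes TensorFlow 2 with eager execution. It suggests that a concatenation done once, before the forward pass is recorded, yields a plain tensor with no recorded link back to the variables, which would be consistent with the "No gradients provided" error.

```python
import tensorflow as tf

# Hypothetical stand-ins for the layer's W_0 and W_1 (shapes chosen arbitrarily).
w_list = [tf.Variable(tf.random.normal((3, 2))) for _ in range(2)]
x = tf.random.normal((4, 6))

# Case 1: concatenate once, outside the tape (analogous to concatenating in build).
w_static = tf.concat(w_list, axis=0)
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.matmul(x, w_static))
grads_static = tape.gradient(loss, w_list)

# Case 2: concatenate inside the tape (analogous to concatenating in call).
with tf.GradientTape() as tape:
    w_dynamic = tf.concat(w_list, axis=0)
    loss = tf.reduce_sum(tf.matmul(x, w_dynamic))
grads_dynamic = tape.gradient(loss, w_list)

# Report, per variable, whether a gradient was found in each case.
print([g is None for g in grads_static])
print([g is None for g in grads_dynamic])
```

If the first list is all True and the second all False, the variables are only reachable from the loss when the concatenation is part of the recorded forward pass, which matches the behavior of the version that concatenates in call.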