TensorFlow custom layer: weights are not training, but the biases are

Time: 2020-03-27 17:03:57

Tags: python keras tensorflow2.0

I have been writing some custom layers, and I've realized that my bias values train but my weights do not. I'll use heavily simplified code here to illustrate the problem.

import tensorflow as tf
from tensorflow.keras.layers import Layer

class myWeights(Layer):
    def __init__(self, units, **kwargs): 
        self.units = units
        super(myWeights, self).__init__(**kwargs)      
    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                         initializer='GlorotUniform',
                         trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                         initializer='random_normal',
                         trainable=True)
        super(myWeights, self).build(input_shape)
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
    def compute_output_shape(self, input_shape):
        return(input_shape[0],self.units)
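The `call` above is an ordinary dense transform, `y = xW + b`. As a sanity check on the shapes involved, here is the same computation sketched in plain NumPy (the batch size of 4 is just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 784))   # batch of flattened 28x28 inputs
w = rng.standard_normal((784, 32))  # kernel, shape (input_dim, units)
b = np.zeros(32)                    # bias, shape (units,)

# Same math as tf.matmul(inputs, self.w) + self.b in the layer's call()
y = x @ w + b
```

With `units=32`, the output has shape `(batch, 32)`, matching `compute_output_shape`.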

Now I set up the MNIST data for training. I also set a seed so this is reproducible on your end.

tf.random.set_seed(1234)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train=tf.keras.utils.normalize(x_train, axis=1)
x_test=tf.keras.utils.normalize(x_test, axis=1)
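As I understand it, `tf.keras.utils.normalize` with the default order performs L2 normalization along the given axis, so each slice along `axis=1` is scaled to unit length. A small NumPy sketch of that operation on a 2-D array:

```python
import numpy as np

x = np.array([[3.0, 4.0],
              [6.0, 8.0]])

# L2-normalize each row, like tf.keras.utils.normalize(x, axis=1)
norms = np.linalg.norm(x, ord=2, axis=1, keepdims=True)
x_normed = x / norms
```

Each row now has unit Euclidean norm: `[3, 4]` becomes `[0.6, 0.8]`.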

I build the model using the functional API:

from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

inp=Input(shape=(x_train.shape[1:]))
flat=Flatten()(inp)
hid=myWeights(32)(flat)
out=Dense(10, 'softmax')(hid)
model=Model(inp,out)
model.compile(optimizer='adam',
         loss='sparse_categorical_crossentropy',
         metrics=['accuracy'])
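The `sparse_categorical_crossentropy` loss used here takes integer class labels (as in `y_train`) rather than one-hot vectors, and penalizes the negative log-probability the softmax assigns to the true class. A NumPy sketch of that loss for a toy batch (the probabilities below are made up for illustration):

```python
import numpy as np

# Softmax outputs for 2 samples over 3 classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])  # integer class ids, as sparse_* expects

# Mean negative log-probability of the true class per sample.
loss = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
```

The loss is low when the model puts high probability on the correct digit and grows without bound as that probability approaches zero.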

Now, when I check the parameter values using

print(model.layers[2].get_weights())

I see the following output, which I have reformatted for readability.

[array([[ 0.00652369, -0.02321771,  0.01399945, ..., -0.07599965, -0.04356881, -0.0333882 ],
        [-0.03132245, -0.05264733,  0.05576386, ..., -0.03755575,  0.07358163, -0.02338506],
        [-0.01808248,  0.04092623,  0.02177643, ...,  0.00971264,  0.07631209,  0.0495184 ],
        ...,
        [-0.03780914,  0.00219346,  0.04460619, ..., -0.06703794,  0.03407502, -0.01071112],
        [-0.0012739 , -0.0683699 , -0.06152753, ...,  0.05373723,  0.03079057,  0.00855774],
        [ 0.06245673, -0.07649396,  0.06748571, ..., -0.06948434, -0.01416317, -0.08318184]], dtype=float32),
 array([ 0.05734033,  0.04822996,  0.04391507, -0.01550511,  0.05383257,
         0.05043739, -0.04092903, -0.0081823 , -0.06425817,  0.02402171,
        -0.00374672, -0.06069579, -0.08422226,  0.02909392, -0.02071654,
         0.0422841 , -0.05020861,  0.01267704,  0.0365625 , -0.01743891,
        -0.01030697,  0.00639807, -0.01493454,  0.03214667,  0.03262959,
         0.07799669,  0.05789128,  0.01754347, -0.07558075,  0.0466203 ,
        -0.05332188,  0.00270758], dtype=float32)]

After training with

model.fit(x_train,y_train, epochs=3, verbose=1)
print(model.layers[2].get_weights())

I find the following output.

[array([[ 0.00652369, -0.02321771,  0.01399945, ..., -0.07599965, -0.04356881, -0.0333882 ],
        [-0.03132245, -0.05264733,  0.05576386, ..., -0.03755575,  0.07358163, -0.02338506],
        [-0.01808248,  0.04092623,  0.02177643, ...,  0.00971264,  0.07631209,  0.0495184 ],
        ...,
        [-0.03780914,  0.00219346,  0.04460619, ..., -0.06703794,  0.03407502, -0.01071112],
        [-0.0012739 , -0.0683699 , -0.06152753, ...,  0.05373723,  0.03079057,  0.00855774],
        [ 0.06245673, -0.07649396,  0.06748571, ..., -0.06948434, -0.01416317, -0.08318184]], dtype=float32),
 array([-0.250459  , -0.21746232,  0.01250297,  0.00065066, -0.09093136,
         0.04943814, -0.13446714, -0.11985168,  0.23259214, -0.14288908,
         0.03274751,  0.1462888 , -0.2206902 ,  0.14455307,  0.17767513,
         0.11378342, -0.22250313,  0.11601174, -0.1855521 ,  0.0900097 ,
         0.21218981, -0.03386492, -0.06818825,  0.34211585, -0.24891953,
         0.08827516,  0.2806849 ,  0.07634751, -0.32905066, -0.1860122 ,
         0.06170518, -0.20212872], dtype=float32)]

I can see that the bias values have changed, but the weight values are static. I'm not at all sure why this is happening.

1 answer:

Answer 0: (score: 0)

What you are trying to build is a multilayer perceptron (MLP). An MLP typically consists of one (pass-through) input layer, one or more layers of TLUs called hidden layers, and a final layer called the output layer.

Here the signal flows in only one direction (from input to output), so this architecture is an example of a feedforward neural network (FNN).

See this link, which explains feedforward neural networks.

To explain the code: you are initializing the weights with an initializer, so the first initialization of the weights happens in the hidden layer, and they are then updated in the following Dense layer.
Therefore, even though the weights are initialized, they remain the same in the hidden layer after training, because this is a feedforward neural network, meaning it does not depend on the output of the current layer.

However, if you want to verify this in code, you can add one more hidden layer identical to the existing one and then look at the weights of layer 3 (hidden layer 2), as follows.

inp=Input(shape=(x_train.shape[1:]))
flat=Flatten()(inp)
hid=myWeights(32)(flat)
hid2=myWeights(32)(hid)
out=Dense(10, 'softmax')(hid2)
model=Model(inp,out)
model.compile(optimizer='adam',
         loss='sparse_categorical_crossentropy',
         metrics=['accuracy'])

Then, by printing the weights of the hidden-2 layer before and after fit, you will get different weights, because the weights of hidden layer 2 depend on the output of hidden layer 1.

print(model.layers[3].get_weights())
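One common pattern for this kind of before/after check is to snapshot `get_weights()` prior to `fit` and compare element-wise afterwards. A framework-free sketch of that comparison, using NumPy arrays as stand-ins for the kernel and bias (the +0.05/-0.01 "updates" are made up to simulate training):

```python
import numpy as np

# Stand-ins for layer.get_weights() before training.
w_before = np.array([[0.1, -0.2], [0.3, 0.4]])
b_before = np.zeros(2)

# Pretend training nudged both tensors.
w_after = w_before + 0.05
b_after = b_before - 0.01

# np.allclose tells us whether a tensor actually moved.
kernel_changed = not np.allclose(w_before, w_after)
bias_changed = not np.allclose(b_before, b_after)
```

Comparing full arrays this way avoids being misled by the `...`-truncated `print` output, which only shows a few corner entries of each matrix.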