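I am trying to build an ordinal regression model as described in this paper. Its key idea is to share the weights in the last layer while keeping separate biases, in order to obtain rank monotonicity (that is, for any N, P[Y > N-1] must always be at least P[Y > N]). This is very desirable for me because some of my values occur only rarely, but I still want to obtain probabilities for them. So far I have implemented the label encoding it describes, but I do not have rank monotonicity, because sometimes P(Y > 5) > P(Y > 4).
In Keras, how exactly do I implement weight sharing without bias sharing? I know the functional API lets layers share both weights and biases, but that does not help in this case. Thanks to anyone who can help.
Edit: sharing the weights, but not the biases, across the N neurons within one layer, or across N layers, would solve this. I also think that setting the use_bias argument of Dense() to False and creating some kind of custom bias layer would solve it, but I am not sure how to do that.
I think the equations for six neurons and five inputs look like this: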
op1 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b1
op2 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b2
op3 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b3
op4 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b4
op5 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b5
op6 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b6
where w1 to w5 are the weights, z1 to z5 are the inputs, and b1 to b6 are the bias terms.
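To make the equations above concrete, here is a small pure-Python sketch. It assumes each op_k is passed through a sigmoid and read as P(Y > k), which is my reading of the setup rather than something stated above. The numeric values and the helper name `shared_weight_outputs` are made up for illustration. Because every output shares the same weighted sum, the ordering of the outputs depends only on the ordering of the biases.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_weight_outputs(weights, inputs, biases):
    """Return sigmoid(w . z + b_k) for every bias b_k (hypothetical helper)."""
    # The weighted sum is computed once and reused by all outputs.
    shared = sum(w * z for w, z in zip(weights, inputs))
    return [sigmoid(shared + b) for b in biases]


weights = [0.4, -0.2, 0.1, 0.3, -0.5]      # w1..w5 (made-up values)
inputs = [1.0, 2.0, -1.0, 0.5, 3.0]        # z1..z5 (made-up values)
biases = [2.0, 1.0, 0.5, 0.0, -1.0, -2.0]  # b1..b6, non-increasing

probs = shared_weight_outputs(weights, inputs, biases)

# With non-increasing biases the outputs are non-increasing for any input,
# because the sigmoid is monotone and the shared sum is the same for all k.
is_monotone = all(a >= b for a, b in zip(probs, probs[1:]))
print(is_monotone)
```

Note that the sketch only shows that ordered biases give ordered outputs. Sharing the weights alone does not force the learned biases to be ordered.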
Answer 0 (score: 2)
One way to achieve this is to define a custom bias layer, as shown here.
PS: change the input shapes and initializers as needed.
import tensorflow as tf

print('TensorFlow:', tf.__version__)


class BiasLayer(tf.keras.layers.Layer):
    """Adds a trainable bias vector to its input."""

    def __init__(self, units, *args, **kwargs):
        super(BiasLayer, self).__init__(*args, **kwargs)
        # Pass name and shape by keyword so the call does not depend on argument order.
        self.bias = self.add_weight(name='bias',
                                    shape=[units],
                                    initializer='zeros',
                                    trainable=True)

    def call(self, x):
        return x + self.bias


z1 = tf.keras.Input(shape=[1])
z2 = tf.keras.Input(shape=[1])
z3 = tf.keras.Input(shape=[1])
z4 = tf.keras.Input(shape=[1])
z5 = tf.keras.Input(shape=[1])

# One Dense layer without bias, reused for every input, so its weights are shared.
dense_layer = tf.keras.layers.Dense(units=10, use_bias=False)

# Each output gets its own BiasLayer, so the biases are not shared.
op1 = BiasLayer(units=10)(dense_layer(z1))
op2 = BiasLayer(units=10)(dense_layer(z2))
op3 = BiasLayer(units=10)(dense_layer(z3))
op4 = BiasLayer(units=10)(dense_layer(z4))
op5 = BiasLayer(units=10)(dense_layer(z5))

model = tf.keras.Model(inputs=[z1, z2, z3, z4, z5], outputs=[op1, op2, op3, op4, op5])
model.summary()
Output:
TensorFlow: 2.1.0-dev20200107
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 1)] 0
__________________________________________________________________________________________________
input_2 (InputLayer) [(None, 1)] 0
__________________________________________________________________________________________________
input_3 (InputLayer) [(None, 1)] 0
__________________________________________________________________________________________________
input_4 (InputLayer) [(None, 1)] 0
__________________________________________________________________________________________________
input_5 (InputLayer) [(None, 1)] 0
__________________________________________________________________________________________________
dense (Dense) (None, 10) 10 input_1[0][0]
input_2[0][0]
input_3[0][0]
input_4[0][0]
input_5[0][0]
__________________________________________________________________________________________________
bias_layer (BiasLayer) (None, 10) 10 dense[0][0]
__________________________________________________________________________________________________
bias_layer_1 (BiasLayer) (None, 10) 10 dense[1][0]
__________________________________________________________________________________________________
bias_layer_2 (BiasLayer) (None, 10) 10 dense[2][0]
__________________________________________________________________________________________________
bias_layer_3 (BiasLayer) (None, 10) 10 dense[3][0]
__________________________________________________________________________________________________
bias_layer_4 (BiasLayer) (None, 10) 10 dense[4][0]
==================================================================================================
Total params: 60
Trainable params: 60
Non-trainable params: 0
__________________________________________________________________________________________________
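The model above feeds five separate one-feature inputs through the shared Dense layer. As a possible variant that follows the six equations in the question more literally, the sketch below uses a single input with five features, one shared `Dense(1, use_bias=False)`, and a single bias layer holding six biases that are broadcast onto the shared output. This is an assumption about what the question needs, not the method from the paper. The sigmoid activation and the names `num_features`, `num_outputs`, `shared_logit`, `logits` and `probs` are illustrative choices.

```python
import numpy as np
import tensorflow as tf


class BiasLayer(tf.keras.layers.Layer):
    """Adds one trainable bias per output unit (illustrative helper)."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.bias = self.add_weight(name="bias", shape=(self.units,),
                                    initializer="zeros", trainable=True)

    def call(self, x):
        # x has shape (batch, 1); broadcasting gives (batch, units).
        return x + self.bias


num_features = 5  # z1..z5
num_outputs = 6   # op1..op6

inputs = tf.keras.Input(shape=(num_features,))
# One output unit without bias: the weights w1..w5 are shared by all outputs.
shared_logit = tf.keras.layers.Dense(1, use_bias=False)(inputs)
# Six separate biases b1..b6 added to the same shared value.
logits = BiasLayer(num_outputs)(shared_logit)
probs = tf.keras.layers.Activation("sigmoid")(logits)

model = tf.keras.Model(inputs, probs)

# Random data, only to check the output shape.
x = np.random.rand(4, num_features).astype("float32")
print(model(x).shape)  # (4, 6)
```

This variant has 5 shared weights plus 6 biases, so 11 trainable parameters in total. Whether the learned biases end up ordered depends on the loss and the label encoding used during training.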