I am trying to replicate, on a DNN built in Keras, the binarization process used in https://github.com/MatthieuCourbariaux/BinaryConnect. Before the weight gradients are determined, the weights are stochastically binarized using the following two functions:
import theano
import theano.tensor as T

# hard_sigmoid(): clips the input to [0, 1]
# Input parameters:
#   x - input weight to transform/binarize
# Output:
#   transformed/binarized weight
def hard_sigmoid(x):
    # Activation f(x) = (x + 1)/2, clipped to [0, 1]
    return T.clip((x + 1.) / 2., 0, 1)
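For reference, here is my own minimal translation of the same clipping into keras.backend in place of Theano's T (the name hard_sigmoid_keras is mine, not from the BinaryConnect repo):

from keras import backend as K

def hard_sigmoid_keras(x):
    # Same mapping as above: f(x) = (x + 1)/2, clipped to [0, 1]
    return K.clip((x + 1.) / 2., 0., 1.)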
Note that the binarization() function was originally written for Lasagne, a Python neural-network library that is now outdated.
# binarization(): used to binarize network weights
# Input parameters:
#   W - input weight to transform/binarize
#   H - upper/lower bound for the binary weights (+H / -H; 1 by default)
#   binary - binarize the weight using a hard-sigmoid transformation
#   deterministic - round the weight to the nearest integer
#   stochastic - map with probabilities p = hard_sigmoid(w) and 1 - p
# Output:
#   transformed/binarized weight
def binarization(W, H, binary=True, deterministic=False, stochastic=False, srng=None):
    # If binary is False, or deterministic and stochastic are both True,
    # do not binarize the weight
    if not binary or (deterministic and stochastic):
        Wb = W
    else:
        Wb = hard_sigmoid(W / H)  # map the weight into [0, 1]
        # Stochastic BinaryConnect: sample with probabilities p = hard_sigmoid(w) and 1 - p
        if stochastic:
            # Draw n=1 Bernoulli sample per weight, with success probability p = Wb,
            # and cast the result to theano.config.floatX (float32 by default)
            Wb = T.cast(srng.binomial(n=1, p=Wb, size=T.shape(Wb)), theano.config.floatX)
        # Deterministic BinaryConnect: round to nearest
        else:
            Wb = T.round(Wb)
        # Map {0, 1} to {-H, +H}: if Wb == 1, return H; if Wb == 0, return -H
        Wb = T.cast(T.switch(Wb, H, -H), theano.config.floatX)
    return Wb  # return the transformed weight
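My current attempt at porting this logic to Keras is the sketch below, again against keras.backend. It assumes the hard_sigmoid_keras helper above; the Bernoulli sampling via K.random_uniform and the arithmetic 2*Wb - 1 (replacing Theano's srng.binomial and T.switch) are my own substitutions:

def binarization_keras(W, H=1., stochastic=False):
    Wb = hard_sigmoid_keras(W / H)  # map the weight into [0, 1]
    if stochastic:
        # Bernoulli sample: 1 with probability Wb, else 0
        Wb = K.cast(K.less(K.random_uniform(K.shape(Wb)), Wb), K.floatx())
    else:
        Wb = K.round(Wb)  # deterministic: round to nearest
    # Map {0, 1} to {-H, +H}
    return (2. * Wb - 1.) * H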
I have been able to successfully initialize a layer with weights drawn uniformly at random from [-1, +1] using the following code:
import keras
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(num_pixels, input_dim=num_pixels,
                kernel_initializer=keras.initializers.RandomUniform(minval=-1., maxval=1., seed=None),
                activation='relu'))
The model is trained with the fit function, as in the source code. I found the class Adamax(Optimizer) in optimizers.py, but I cannot figure out when get_updates is called, or, more specifically, at what point the gradients for each weight are determined and the optimizer is called to update them.
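As a debugging aid, I wrote the small wrapper below (TracingAdamax is my own name; the get_updates signature matches the Keras 2.x source I am reading, while older releases used get_updates(self, params, constraints, loss), so this is version-dependent). If my understanding is right, the print should fire only once, when the symbolic training function is built, rather than once per batch:

from keras.optimizers import Adamax

class TracingAdamax(Adamax):
    # Signature assumed from Keras 2.x; older versions use
    # get_updates(self, params, constraints, loss)
    def get_updates(self, loss, params):
        print('get_updates called with %d trainable weights' % len(params))
        return super(TracingAdamax, self).get_updates(loss, params)

# Usage: model.compile(optimizer=TracingAdamax(), loss='categorical_crossentropy')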
Any help is greatly appreciated.