I am trying to replicate, on a DNN built in Keras, the binarization process used in https://github.com/MatthieuCourbariaux/BinaryConnect. Before the weight gradients are determined, the weights are stochastically binarized using the following two functions:
import theano
import theano.tensor as T

# hard_sigmoid(): clips the input to [0, 1]
# Input parameters:
#   x - input weight to transform/binarize
# Output:
#   transformed/binarized weight
def hard_sigmoid(x):
    # Activation f(x) = (x + 1)/2, clipped to [0, 1]
    return T.clip((x + 1.) / 2., 0, 1)
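For reference, here is my own minimal translation of the same clipping into keras.backend in place of Theano's T (the name hard_sigmoid_keras is mine, not from the BinaryConnect repo):

from keras import backend as K

def hard_sigmoid_keras(x):
    # Same mapping as above: f(x) = (x + 1)/2, clipped to [0, 1]
    return K.clip((x + 1.) / 2., 0., 1.)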
Note that the binarization() function was originally written for Lasagne, a Python neural-network library that is now outdated.
# binarization(): used to binarize network weights
# Input parameters:
#   W - input weight to transform/binarize
#   H - upper/lower bound for the binary weights (+H / -H; 1 by default)
#   binary - binarize the weight using a hard-sigmoid transformation
#   deterministic - round the weight to the nearest integer
#   stochastic - map with probabilities p = hard_sigmoid(w) and 1 - p
# Output:
#   transformed/binarized weight
def binarization(W, H, binary=True, deterministic=False, stochastic=False, srng=None):
    # If binary is False, or deterministic and stochastic are both True,
    # do not binarize the weight
    if not binary or (deterministic and stochastic):
        Wb = W
    else:
        Wb = hard_sigmoid(W / H)  # map the weight into [0, 1]
        # Stochastic BinaryConnect: sample with probabilities p = hard_sigmoid(w) and 1 - p
        if stochastic:
            # Draw n=1 Bernoulli sample per weight, with success probability p = Wb,
            # and cast the result to theano.config.floatX (float32 by default)
            Wb = T.cast(srng.binomial(n=1, p=Wb, size=T.shape(Wb)), theano.config.floatX)
        # Deterministic BinaryConnect: round to nearest
        else:
            Wb = T.round(Wb)
        # Map {0, 1} to {-H, +H}: if Wb == 1, return H; if Wb == 0, return -H
        Wb = T.cast(T.switch(Wb, H, -H), theano.config.floatX)
    return Wb  # return the transformed weight
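My current attempt at porting this logic to Keras is the sketch below, again against keras.backend. It assumes the hard_sigmoid_keras helper above; the Bernoulli sampling via K.random_uniform and the arithmetic 2*Wb - 1 (replacing Theano's srng.binomial and T.switch) are my own substitutions:

def binarization_keras(W, H=1., stochastic=False):
    Wb = hard_sigmoid_keras(W / H)  # map the weight into [0, 1]
    if stochastic:
        # Bernoulli sample: 1 with probability Wb, else 0
        Wb = K.cast(K.less(K.random_uniform(K.shape(Wb)), Wb), K.floatx())
    else:
        Wb = K.round(Wb)  # deterministic: round to nearest
    # Map {0, 1} to {-H, +H}
    return (2. * Wb - 1.) * H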
I have been able to successfully initialize a layer with weights drawn uniformly at random from [-1, +1] using the following code:
import keras
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(num_pixels, input_dim=num_pixels,
                kernel_initializer=keras.initializers.RandomUniform(minval=-1., maxval=1., seed=None),
                activation='relu'))
The model is trained with the fit function, as in the source code. I found the class Adamax(Optimizer) in optimizers.py, but I cannot figure out when get_updates is called, or, more specifically, at what point the gradients for each weight are determined and the optimizer is called to update them.
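As a debugging aid, I wrote the small wrapper below (TracingAdamax is my own name; the get_updates signature matches the Keras 2.x source I am reading, while older releases used get_updates(self, params, constraints, loss), so this is version-dependent). If my understanding is right, the print should fire only once, when the symbolic training function is built, rather than once per batch:

from keras.optimizers import Adamax

class TracingAdamax(Adamax):
    # Signature assumed from Keras 2.x; older versions use
    # get_updates(self, params, constraints, loss)
    def get_updates(self, loss, params):
        print('get_updates called with %d trainable weights' % len(params))
        return super(TracingAdamax, self).get_updates(loss, params)

# Usage: model.compile(optimizer=TracingAdamax(), loss='categorical_crossentropy')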
Any help is greatly appreciated.