Question

I am working on deep nets using Keras. There is an activation called "hard sigmoid". What is its mathematical definition?

I know what a sigmoid is. Someone asked a similar question on Quora: https://www.quora.com/What-is-hard-sigmoid-in-artificial-neural-networks-Why-is-it-faster-than-standard-sigmoid-Are-there-any-disadvantages-over-the-standard-sigmoid

But I could not find a precise mathematical definition anywhere.

Answer 1

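Since Keras supports both TensorFlow and Theano, the exact implementation may differ between backends; I will cover only Theano. For the Theano backend, Keras uses T.nnet.hard_sigmoid, which is in turn a linear approximation of the standard sigmoid: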

slope = tensor.constant(0.2, dtype=out_dtype)
shift = tensor.constant(0.5, dtype=out_dtype)
x = (x * slope) + shift
x = tensor.clip(x, 0, 1)

i.e. it is: max(0, min(1, x*0.2 + 0.5))
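To make the formula above concrete, here is a minimal plain-Python sketch of the same clipped line, printed next to the logistic sigmoid it approximates. The function names and the default `slope`/`shift` arguments are illustrative choices for this example and are not the Keras or Theano API.

```python
import math


def logistic_sigmoid(x: float) -> float:
    """Standard logistic sigmoid, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_sigmoid(x: float, slope: float = 0.2, shift: float = 0.5) -> float:
    """Piecewise linear approximation: clip(slope * x + shift, 0, 1)."""
    return max(0.0, min(1.0, slope * x + shift))


# With slope 0.2 and shift 0.5 the line reaches 0 at x = -2.5 and 1 at x = 2.5,
# so the function is exactly 0 or 1 outside the interval [-2.5, 2.5].
for x in (-4.0, -2.5, 0.0, 2.5, 4.0):
    print(f"x={x:+.1f}  hard={hard_sigmoid(x):.3f}  logistic={logistic_sigmoid(x):.3f}")
```

The hard version avoids the exponential, which is the usual reason given for it being cheaper to evaluate.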

Answer 2

For reference, the hard sigmoid function may be defined differently in different places. In Courbariaux et al. 2016 [1] it is defined as:

σ is the "hard sigmoid" function: σ(x) = clip((x + 1)/2, 0, 1) = max(0, min(1, (x + 1)/2))

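The intent is to provide a probability value (hence the constraint to lie between 0 and 1) for use in the stochastic binarization of neural network parameters (e.g. weights, activations, gradients). You use the probability p = σ(x) returned by the hard sigmoid function to set the parameter x to +1 with probability p, or to -1 with probability 1-p.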
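The stochastic binarization described above can be sketched in a few lines of plain Python. This is only an illustration of the sampling rule, not code from the paper: the names `hard_sigmoid_bnn` and `binarize_stochastic`, the seed, and the sample count are all assumptions made for the example.

```python
import random


def hard_sigmoid_bnn(x: float) -> float:
    """Hard sigmoid as defined in [1]: clip((x + 1) / 2, 0, 1)."""
    return max(0.0, min(1.0, (x + 1.0) / 2.0))


def binarize_stochastic(x: float, rng: random.Random) -> int:
    """Return +1 with probability p = hard_sigmoid_bnn(x), otherwise -1."""
    p = hard_sigmoid_bnn(x)
    return 1 if rng.random() < p else -1


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [binarize_stochastic(0.5, rng) for _ in range(10_000)]

# For x = 0.5 the probability is p = 0.75, so roughly three quarters
# of the draws should come out as +1.
print(sum(s == 1 for s in samples) / len(samples))
```

Any x at or above 1 always maps to +1 and any x at or below -1 always maps to -1, because the probability is clipped to exactly 1 or 0 there.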

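[1] https://arxiv.org/abs/1602.02830 - "Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1", Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio (submitted on 9 Feb 2016 (v1), last revised 17 Mar 2016 (this version, v3))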

Answer 3

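The hard sigmoid is normally a piecewise linear approximation of the logistic sigmoid function. Depending on which properties of the original sigmoid you want to keep, you can use a different approximation.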

I personally like to keep the function correct at zero, i.e. σ(0) = 0.5 (shift) and σ'(0) = 0.25 (slope). This could be coded as follows:

import numpy as np

def hard_sigmoid(x):
    return np.maximum(0, np.minimum(1, (x + 2) / 4))
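The claim that this variant matches the logistic sigmoid at zero can be checked numerically. Analytically, σ'(x) = σ(x)(1 - σ(x)), so σ'(0) = 0.25. The sketch below uses a central difference to confirm that both functions share the value 0.5 and the slope 0.25 at x = 0. It is written in plain Python with scalar inputs, and the helper `central_difference` and its step size are assumptions made for this example.

```python
import math


def logistic_sigmoid(x: float) -> float:
    """Standard logistic sigmoid, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_sigmoid(x: float) -> float:
    """Scalar version of the variant above: clip((x + 2) / 4, 0, 1)."""
    return max(0.0, min(1.0, (x + 2.0) / 4.0))


def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Approximate f'(x) as (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Value at zero: both functions should give 0.5.
print(logistic_sigmoid(0.0), hard_sigmoid(0.0))

# Slope at zero: both estimates should be close to 0.25.
print(central_difference(logistic_sigmoid, 0.0), central_difference(hard_sigmoid, 0.0))
```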

Answer 4

It is

  clip((x + 1)/2, 0, 1)

or, in coding parlance:

  max(0, min(1, (x + 1)/2))

How is hard sigmoid defined?
