Question

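Consider a neural network with one hidden layer and ReLU activations. It seems to be common practice in neural networks to scale the weights so that they are distributed as $\mathcal{N}(0, 1/\sqrt{d})$, where $d$ is the dimensionality of the data.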

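Suppose I normalise the data before feeding it to the network. The question is what the best practice is for normalising the input data. The main concern is that ReLU has zero gradient for negative values, so learning stops there, creating 'dead neurons'.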

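I have included two different initialisations below (in scikit-learn speak, StandardScaler and MinMaxScaler). I realise the histograms are not conclusive, but they might be a start. Note in particular how the variance is smaller for MinMaxScaler. Also, I know that the biases can push values negative, but again the common initialisation I have seen for the biases is zero.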

import numpy as np
import matplotlib.pyplot as plt

N = 10000
D = 100
Dout = 20

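# Gaussian initialisation (0 mean, 1 standard deviation), as after StandardScaler
x = np.random.randn(N, D)
# Zero-mean Gaussian weights, scaled by 1/sqrt(Dout)
w = np.random.randn(D, Dout)*1/np.sqrt(Dout)
# Pre-activations of the layer (no bias term)
y = x.dot(w)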

plt.hist(y.ravel(),100)
plt.show()

# MinMaxScaler (values between 0 and 1)
x = np.random.rand(N, D)
y = x.dot(w)

plt.hist(y.ravel(),100)
plt.show()
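
The histograms above pool all units together, which hides how each individual unit behaves. As a rough sketch (not part of the original question), the snippet below counts, per hidden unit, the fraction of samples whose pre-activation is positive, i.e. where a ReLU would pass a gradient. The random inputs only stand in for StandardScaler and MinMaxScaler output, the seed and the helper name `active_fraction` are illustrative assumptions, and the weights use the $1/\sqrt{d}$ scale from the text rather than the `1/np.sqrt(Dout)` used in the code above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
N, D, Dout = 10000, 100, 20

# Zero-mean Gaussian weights scaled by 1/sqrt(D). A positive rescaling of w
# changes the variance of y but not its sign, so the active fractions below
# do not depend on this choice.
w = rng.standard_normal((D, Dout)) / np.sqrt(D)

def active_fraction(x, w):
    """Per hidden unit: fraction of samples with a positive pre-activation."""
    return (x.dot(w) > 0).mean(axis=0)

x_std = rng.standard_normal((N, D))  # stand-in for StandardScaler output
x_01 = rng.random((N, D))            # stand-in for MinMaxScaler output

frac_std = active_fraction(x_std, w)
frac_01 = active_fraction(x_01, w)

print("variance of y, standardised input:", x_std.dot(w).var())
print("variance of y, [0, 1] input:      ", x_01.dot(w).var())
print("active fraction per unit, standardised:", frac_std.round(2))
print("active fraction per unit, [0, 1]:      ", frac_01.round(2))
```

With zero-mean inputs each unit should be active on roughly half of the samples. With inputs in [0, 1] every feature is non-negative, so the sign of a unit's pre-activation is driven largely by the sum of its weights, and the per-unit fractions should spread out towards 0 and 1. This only describes the state at initialisation; it says nothing about how the units evolve during training.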

Answer 1

In your case I would suggest using a leaky ReLU together with some method for optimising the learning rate; to save computation, see this answer.

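    from matplotlib import pyplot as plt
    import numpy as np

    def relu_leaky(x):
        # Leaky ReLU: identity for x >= 0, slope 0.1 for negative inputs
        return np.where(x >= 0, x, x * 0.1)

    def relu_deriv(x):
        # Derivative of the leaky ReLU with respect to its input
        # (0.1 is used at x = 0, where the derivative is not defined)
        return np.where(x > 0, 1, 0.1)

    X = np.arange(-10, 10)
    y1 = relu_leaky(X)
    # Evaluate the derivative at the inputs X, not at the outputs y1
    y2 = relu_deriv(X)

    plt.plot(X, y1)
    plt.scatter(X, y2, c='red')
    plt.show()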
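
To make the link to the dead-neuron concern explicit, here is a minimal sketch comparing the gradient that a plain ReLU and a leaky ReLU pass back for the same pre-activations. The function names, the example values in `z`, and the all-ones upstream gradient are illustrative assumptions; the slope of 0.1 is the one used in the code above.

```python
import numpy as np

def relu_grad(z):
    # Plain ReLU: gradient is 1 for positive inputs and 0 otherwise
    return (z > 0).astype(float)

def leaky_relu_grad(z, alpha=0.1):
    # Leaky ReLU: gradient is 1 for positive inputs and alpha otherwise
    return np.where(z > 0, 1.0, alpha)

z = np.array([-2.0, -0.5, 0.5, 2.0])  # example pre-activations
upstream = np.ones_like(z)            # gradient arriving from the next layer

grad_relu = upstream * relu_grad(z)
grad_leaky = upstream * leaky_relu_grad(z)

print(grad_relu)   # [0. 0. 1. 1.]
print(grad_leaky)  # [0.1 0.1 1.  1. ]
```

For negative pre-activations the plain ReLU passes back no gradient, so the incoming weights of that unit receive no update from those samples, whereas the leaky variant still passes back a small gradient.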

Normalising a ReLU neural network
