Implementing the momentum weight update for a neural network

Posted: 2017-11-27 22:21:06

Tags: python numpy machine-learning neural-network mnist

I am following mnielsen's online book. I am trying to implement the momentum weight update defined here into his code here. The general idea is that with a momentum update you do not change the weight vector directly with the negative gradient. Instead, you keep a velocity parameter v, initialized to zero, and a hyperparameter mu set to 0.9:

# Momentum update
v = mu * v - learning_rate * dx # integrate velocity
x += v # integrate position
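On NumPy arrays this update works elementwise, so both lines vectorize over all parameters at once. A minimal sketch with made-up values (mu, learning_rate, and the gradient dx are illustrative, not from the book):

```python
import numpy as np

mu = 0.9            # momentum hyperparameter
learning_rate = 0.1
v = np.zeros(3)     # velocity, initialized to zero
x = np.zeros(3)     # parameters being optimized
dx = np.array([1.0, -2.0, 0.5])  # gradient at x (made up)

v = mu * v - learning_rate * dx  # integrate velocity
x += v                           # integrate position
```

After one step with zero initial velocity this reduces to plain gradient descent; the mu * v term only starts to matter on subsequent steps, where it accumulates the direction of previous updates.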

So in the snippet below I have the weights w, and the weight gradients are nebla_w:

def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]

So in the last two lines, self.weights is updated as

self.weights = [w-(eta/len(mini_batch))*nw
                for w, nw in zip(self.weights, nabla_w)]

For the momentum weight update I do the following:

self.momentum_v = [ (momentum_mu * self.momentum_v) - ( ( float(eta) / float(len(mini_batch)) )* nw) 
                   for nw in nebla_w ]
self.weights = [ w + v 
                for w, v in zip (self.weights, self.momentum_v)]

However, I get the following error:

 TypeError: can't multiply sequence by non-int of type 'float'

for the momentum_v update. My eta hyperparameter is already a float, although I wrapped it with float() anyway. I also wrapped len(mini_batch) with float(). I also tried nw.astype(float), but I still get the error, and I don't know why. nabla_w is a list of floats.
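The error message is consistent with self.momentum_v being a plain Python list of per-layer arrays rather than a single array: multiplying a list by a float raises exactly this TypeError, regardless of how eta or the gradients are typed. A minimal sketch reproducing the error and one likely fix, which pairs each velocity array with its gradient via zip (the shapes and values here are made up for illustration):

```python
import numpy as np

momentum_mu = 0.9
eta = 3.0
n = 10  # mini-batch size

# velocities and gradients as lists of per-layer arrays, mirroring self.weights
momentum_v = [np.zeros((3, 2)), np.zeros((2, 1))]
nabla_w = [np.ones((3, 2)), np.ones((2, 1))]
weights = [np.zeros((3, 2)), np.zeros((2, 1))]

# momentum_mu * momentum_v fails: momentum_v is a list, and a list can only
# be multiplied by an int (replication), never by a float
try:
    momentum_mu * momentum_v
except TypeError as e:
    print(e)  # can't multiply sequence by non-int of type 'float'

# fix: update each layer's velocity array individually
momentum_v = [momentum_mu * v - (eta / n) * nw
              for v, nw in zip(momentum_v, nabla_w)]
weights = [w + v for w, v in zip(weights, momentum_v)]
```

The original comprehension iterated only over nebla_w while using the whole self.momentum_v list inside the body; zipping the two lists together makes v a per-layer array, so mu * v becomes an elementwise NumPy multiplication.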

0 Answers:

No answers yet