Training a single linear neuron for regression with batch gradient descent

Date: 2016-03-02 16:17:06

Tags: python python-3.x numpy matplotlib machine-learning

I am trying to train some weights with gradient descent, but without much success. I started with a learning rate lr of 0.01 and, to my surprise, the cost actually skyrocketed. I can only assume the rate wasn't small enough to find any local minimum. Changing it to 0.0000000000001 makes the cost stabilise and decrease slowly:

Iteration 998 | Cost: 2444.995584
Iteration 999 | Cost: 2444.995577
Iteration 1000 | Cost: 2444.995571

Final weights: 5.66633309647e-07 | 4.32179246434e-09

However, something is wrong with either these weights or the way I am plotting them:

[image: plot of the training data points and the fitted line]

import numpy as np
import matplotlib.pyplot as plt


def gradient_descent(x, y, w, lr, m, iter):
    xTrans = x.transpose()
    for i in range(iter):
        prediction = np.dot(x, w)
        loss = prediction - y
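        # cost is the mean squared error over the m training samples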
        cost = np.sum(loss ** 2) / m

        print("Iteration %d | Cost: %f" % (i + 1, cost))

        gradient = np.dot(xTrans, loss) / m     # avg gradient

        w = w - lr * gradient   # update the weight vector

    return w

# generate noisy data: a linear function plus uniform noise in [-10, 10]
x = np.arange(1, 200, 2)
d = np.random.uniform(-10, 10, x.size)
y = .4 * x + 3 + d

# number of training samples
m = y.size

# add a column of ones for bias values
it = np.ones(shape=(m, 2))
it[:, 1] = x

m, n = np.shape(it)

# initialise weights to 0
w = np.zeros(n)

iter = 1000             # number of iterations
lr = 0.0000000000001    # learning rate / alpha

trained_w = gradient_descent(it, y, w, lr, m, iter)
result = trained_w[1] * x + trained_w[0]    # predictions of the learned linear function
print("Final weights: %s | %s" % (trained_w[1], trained_w[0]))

plt.plot(x, y, 'gx')
plt.plot(x, result)

plt.show()

1 Answer:

Answer 0 (score: 1):

You are overcompensating. The learning rate is now so tiny that it would take billions of iterations to converge. Set it to something smaller than 0.01 but larger than what you have now.

For me, an alpha of 0.0001 worked well.
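A rough sketch of that change, reusing the gradient_descent function and the data setup from the question (0.0001 is simply the value that happened to work here and may still need tuning for other data):

lr = 0.0001     # small enough that the cost no longer explodes, large enough to make progress
iter = 1000

trained_w = gradient_descent(it, y, w, lr, m, iter)
result = trained_w[1] * x + trained_w[0]
print("Final weights: %s | %s" % (trained_w[1], trained_w[0]))

# the printed cost should now fall sharply over the iterations instead of
# blowing up (lr too large) or barely moving (lr too small)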