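I'm trying to use gradient descent to train some weights, but I'm not having much success.
I started with a learning rate lr of 0.01, and to my surprise the cost skyrocketed. I can only assume the rate was not small enough to find any local minimum. Changing it to 0.0000000000001 makes the cost stabilize and decrease slowly: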
Iteration 998 | Cost: 2444.995584
Iteration 999 | Cost: 2444.995577
Iteration 1000 | Cost: 2444.995571
Final weights: 5.66633309647e-07 | 4.32179246434e-09
However, something is wrong either with these weights or with how I'm plotting them:
import numpy as np
import matplotlib.pyplot as plt

def gradient_descent(x, y, w, lr, m, iter):
    xTrans = x.transpose()
    for i in range(iter):
        prediction = np.dot(x, w)
        loss = prediction - y
        cost = np.sum(loss ** 2) / m
        print("Iteration %d | Cost: %f" % (i + 1, cost))
        gradient = np.dot(xTrans, loss) / m  # avg gradient
        w = w - lr * gradient  # update the weight vector
    return w

# generate data from a linear function plus uniform noise in [-10, +10]
x = np.arange(1, 200, 2)
d = np.random.uniform(-10, 10, x.size)
y = .4 * x + 3 + d

# number of training samples
m = y.size

# add a column of ones for bias values
it = np.ones(shape=(m, 2))
it[:, 1] = x
m, n = np.shape(it)

# initialise weights to 0
w = np.zeros(n)

iter = 1000  # number of iterations
lr = 0.0000000000001  # learning rate / alpha

trained_w = gradient_descent(it, y, w, lr, m, iter)
result = trained_w[1] * x + trained_w[0]  # linear plot of our predicted function
print("Final weights: %s | %s" % (trained_w[1], trained_w[0]))

plt.plot(x, y, 'gx')
plt.plot(x, result)
plt.show()
Answer 0 (score: 1)
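You overcompensated. The learning rate is now so small that it would take billions of iterations to converge. Set it to a value smaller than 0.01
but larger than what you have now.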
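
To see where the usable range lies, note that for a quadratic cost like this one, the update w = w - lr * X^T(Xw - y) / m stays stable only when lr is below 2 divided by the largest eigenvalue of X^T X / m. The sketch below estimates that bound for the x values in the question; the names `features`, `hessian` and `max_stable_lr` are introduced here for illustration and are not part of the original code.

```python
import numpy as np

# Same inputs as in the question: a bias column of ones plus x = 1, 3, ..., 199
x = np.arange(1, 200, 2)
m = x.size
features = np.column_stack([np.ones(m), x])

# Curvature matrix that matches the question's update rule,
# whose gradient is X^T (Xw - y) / m
hessian = features.T @ features / m

# The matrix is symmetric, so eigvalsh applies; take its largest eigenvalue
largest_eigenvalue = np.linalg.eigvalsh(hessian).max()

# Gradient descent on a quadratic diverges once lr exceeds 2 / largest_eigenvalue
max_stable_lr = 2.0 / largest_eigenvalue
print("largest eigenvalue: %.2f" % largest_eigenvalue)
print("max stable learning rate: %.6f" % max_stable_lr)
```

For this data the bound comes out at roughly 1.5e-4, which is why 0.01 blows up while values a little below that still make progress. The large eigenvalue comes from the unscaled x column, so rescaling the feature would raise the bound, though that is a separate change from the one suggested here.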
For me, an alpha of 0.0001
works well.
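
As a rough check, the sketch below runs the same kind of update with three learning rates on data generated like the question's. It is an illustration under assumptions rather than the asker's exact script: the fixed seed, the early stop once the cost passes 1e100, and the names `run_gradient_descent`, `features` and `results` are all choices made here.

```python
import numpy as np

def run_gradient_descent(features, y, lr, num_iters):
    """Batch gradient descent for least squares.

    Returns the weights and the last cost that was measured
    (the cost is evaluated before each update).
    """
    m = y.size
    w = np.zeros(features.shape[1])
    cost = np.inf
    for _ in range(num_iters):
        loss = features @ w - y
        cost = np.sum(loss ** 2) / m
        if cost > 1e100:
            break  # treat as diverged and stop before floats overflow
        w = w - lr * (features.T @ loss) / m
    return w, cost

# Data as in the question; the seed is an arbitrary choice for repeatability
rng = np.random.default_rng(0)
x = np.arange(1, 200, 2)
y = 0.4 * x + 3 + rng.uniform(-10, 10, x.size)
features = np.column_stack([np.ones(x.size), x])

results = {}
for lr in (1e-2, 1e-4, 1e-13):
    w, cost = run_gradient_descent(features, y, lr, 1000)
    results[lr] = (w, cost)
    print("lr=%g | cost=%g | slope=%g | bias=%g" % (lr, cost, w[1], w[0]))
```

With 0.01 the cost explodes, with 1e-13 it barely moves from its starting value, and with 0.0001 the slope settles near the true 0.4. The bias term still converges slowly at that rate, so after 1000 iterations it remains well short of 3.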