I am trying to implement gradient descent as explained in these notes: http://cs229.stanford.edu/notes/cs229-notes1.pdf. The code below returns exponentially large parameters, and if I increase the number of iterations the params reach infinity. I have wasted four hours trying to figure out what is wrong. Please help.
import pandas as pd
import numpy as np
advertising_data = pd.read_csv("http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv", index_col=0)
target = np.array(advertising_data.Sales.values)
advertising_data["ones"] = np.ones(200)
advertising_data = advertising_data[["ones", "TV"]]
features = np.array(advertising_data.values)
def error_ols(target, features):
    def h(betas):
        error = target - np.dot(features, betas)
        return error
    return h
def ols_loss(errors):
    return np.sum(errors * errors)
def gradient_descend(initial_guess, learning_step, gradient, iterations=10):
    for i in range(0, iterations):
        # note: the step is ADDED to the current guess, so gradient()
        # must return the direction that decreases the loss
        update = initial_guess + learning_step * gradient(initial_guess)
        initial_guess = update
        error = error_ols(target, features)(update)
        print(ols_loss(error))
    return update
def ols_gradient(target, features):
    def h(betas):
        error = target - np.dot(features, betas)
        return -np.dot(error, features)
    return h
gradient_function = ols_gradient(target, features)
initial_guess = np.array([1,1])
gradient_descend(initial_guess, 0.0001, gradient_function)
Answer (score: 3)
It took a long time to work through this; treat it as an exercise in attention to detail. The bug is a sign error: your ols_gradient returns -np.dot(error, features), which is the actual (uphill) gradient of the squared-error loss, but gradient_descend adds learning_step * gradient(...) to the current guess, so every step moves uphill and the loss explodes. Return the descent direction instead:
def ols_gradient(target, features):
    def h(betas):
        error = target - np.dot(features, betas)
        # sign flipped: this is now the descent direction
        return np.dot(error, features)
    return h
Also make sure to drop the learning rate to .0000001; even with the correct sign, the original 0.0001 step is far too large for the unscaled TV values and still diverges.
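With both changes the printed loss decreases on every iteration instead of blowing up. As a quick sanity check (a minimal sketch, assuming target and features are built exactly as in the question), you can compare against NumPy's closed-form least-squares solution:

gradient_function = ols_gradient(target, features)  # corrected version
betas = gradient_descend(np.array([1.0, 1.0]), 0.0000001, gradient_function)

# closed-form OLS fit for comparison; the gradient-descent estimate drifts
# toward it, though slowly, because the unscaled TV column leaves the
# problem badly conditioned
beta_exact = np.linalg.lstsq(features, target, rcond=None)[0]
print(beta_exact)  # roughly [7.03, 0.0475] for Sales ~ TV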
Funny how the smallest mistakes are the hardest to spot.
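As an aside, the step has to be that tiny because the TV column runs into the hundreds, which leaves the features badly scaled. Here is a sketch (reusing the data frame and the corrected ols_gradient from above) of how standardizing the feature first lets a learning rate ten thousand times larger converge quickly:

tv = advertising_data["TV"].values
features_scaled = np.column_stack([np.ones(len(tv)),
                                   (tv - tv.mean()) / tv.std()])
gradient_scaled = ols_gradient(target, features_scaled)

# gradient_descend prints the loss using the module-level name `features`,
# so point that name at the scaled matrix before calling it
features = features_scaled
gradient_descend(np.array([0.0, 0.0]), 0.001, gradient_scaled)

The recovered slope is then in standardized units; divide it by tv.std() to get back the per-unit TV coefficient.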