Quadratic hypothesis in linear regression is always wrong

Date: 2020-05-11 15:57:18

Tags: python machine-learning linear-regression

I'm a beginner and this code is purely experimental, so apologies if it's clumsy (for example, I use a lot of consecutive for loops).

The hypothesis is of the form ax^2 + bx + c, with parameters a, b and c.

Test data and output

x=[10,20,30,40,50]
y=[-50,-90,-130,-170,-210]

Hypothesis function

def h(m):
    return a*(x[m]**2)+b*x[m]+c

Updating the parameters

for i in range(1,300000000):
    for j in range (0,4):
        c1=c1+(h(j)-y[j])
        b1=b1+(h(j)-y[j])*(x[j])
        a1=a1+(h(j)-y[j])*(x[j])**2

a=a-0.00000046*0.20*a1
b=b-0.00000046*0.20*b1
c=c-0.00000046*0.20*c1
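
For context, the accumulated sums a1, b1 and c1 are intended as the batch gradients of the (halved) squared-error cost, so the corresponding update rule with learning rate α would be:

a := a - α * Σ (h(x_i) - y_i) * x_i^2
b := b - α * Σ (h(x_i) - y_i) * x_i
c := c - α * Σ (h(x_i) - y_i)

where the sum Σ runs over all data points.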

Cost function

for k in range(0, 4):
    s = s + (h(k) - y[k])**2
cost=1/5*s
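
For reference, this is the mean-squared-error cost J = (1/5) * Σ_k (h(x_k) - y_k)^2 over the data set.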

The answer should be a = 0, b = -4 and c = -10 (the data lie exactly on the line y = -4x - 10), but I get a = 0.02 (and it keeps increasing after every loop), b = -5.4946 and c = 11 when the cost is around 2. (I never actually let the code run to completion, so apologies for that.)

Where am I going wrong?

1 Answer:

Answer 0 (score: 0)

You have almost everything right; there are just two issues. See the inline comments:

# reliance on global/upper scope variables is a bad practice
def h(a, b, c, value):
    return a*(value**2)+b*value+c

# don't repeat yourself - define it in one place rather than 3
lr = 0.00000046*0.20  # learning rate
for i in range(1,300000000):

    # first issue: a1, b1 and c1 need to be reset
    a1 = b1 = c1 = 0

    # hardcoding magic constants (including the data shape) is a bad practice;
    # zip also fixes range(0, 4) silently skipping the fifth data point
    for x_, y_ in zip(x, y):
        diff = h(a, b, c, x_)-y_
        c1 += diff
        b1 += diff * x_
        a1 += diff * x_**2

    # second issue: this needs to be done every "batch"
    a -= lr * a1
    b -= lr * b1
    c -= lr * c1

A vectorized form would be more efficient, but hopefully you get the idea.
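
For illustration, here is a minimal vectorized sketch of the same batch gradient descent using NumPy; the iteration count is shortened and purely illustrative:

import numpy as np

x = np.array([10, 20, 30, 40, 50], dtype=float)
y = np.array([-50, -90, -130, -170, -210], dtype=float)

a = b = c = 0.0
lr = 0.00000046 * 0.20  # same learning rate as above

for i in range(300000):  # far fewer iterations than the original, purely illustrative
    diff = a * x**2 + b * x + c - y  # h(x) - y for all five points at once
    # batch gradients of the (halved) squared-error cost
    a -= lr * np.dot(diff, x**2)
    b -= lr * np.dot(diff, x)
    c -= lr * np.sum(diff)

print(a, b, c, np.mean(diff**2))

Note that x**2 reaches 2500 while the bias term is constant, so the gradient scales differ by orders of magnitude; rescaling the features would allow a much larger learning rate and far faster convergence.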
