Gradient descent on linear regression produces an incorrect bias

Asked: 2016-10-24 05:56:41

Tags: python machine-learning

I have a toy example that sets up a linear regression model with one input variable and one output variable. The problem I'm running into is that the bias in the output is far from that of the generated data. If I set the bias manually, the model produces a weight and bias close enough to the originals.

I wrote two pieces of code: gen_data, which generates the data, and gradientDescent2, which runs the gradient descent algorithm to find the weight and bias.

import numpy as np

def gen_data(num_points=50, slope=1, bias=10, x_max=50):
    f = lambda z: slope * z + bias
    x = np.zeros(shape=(num_points, 1))
    y = np.zeros(shape=(num_points, 1))

    for i in range(num_points):
        x_temp = np.random.uniform()*x_max
        x[i] = x_temp
        y[i] = f(x_temp) + np.random.normal(scale=3.0)

    return (x, y)

# \mathbb{R}^1 with no regularization
def gradientDescent2(x, y, learning_rate=0.0001, epochs=100):
    theta = np.random.rand()
    bias = np.random.rand()

    for i in range(0, epochs):
        loss = (theta * x + bias) - y
        cost = np.mean(loss**2) / 2
        # print('Iteration {} | Cost: {}'.format(i, cost))

        grad_b = np.mean(loss)
        grad_t = np.mean(loss*x)

        # updates
        bias -= learning_rate * grad_b
        theta -= learning_rate * grad_t

    return (theta, bias)
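As a sanity check on the data-generating process above, the same data can be fit with NumPy's closed-form least-squares routine (np.polyfit) to see roughly what slope and bias gradient descent should recover. This is a minimal vectorized sketch of the same setup, with a fixed seed added for reproducibility:

```python
import numpy as np

np.random.seed(0)  # fixed seed so the check is reproducible

def gen_data(num_points=50, slope=1, bias=10, x_max=50):
    # Same data-generating process as above, vectorized
    x = np.random.uniform(size=num_points) * x_max
    y = slope * x + bias + np.random.normal(scale=3.0, size=num_points)
    return x, y

x, y = gen_data()
# Closed-form least-squares fit: the slope/bias that the
# gradient-descent routine should converge toward
slope_fit, bias_fit = np.polyfit(x, y, 1)
```

With the default settings (slope 1, bias 10, noise scale 3), the fitted values should land near the true parameters.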

1 Answer:

Answer 0 (score: -2)

  1. If you want to use batch updates, don't set batch_size equal to your sample size. (I also believe a batch update is not suitable for this case.)
  2. Your gradient computation and parameter update are incorrect; the gradients should be:

    grad_b = 1
    grad_t = x
    

    For the parameter update, you should always try to minimize loss, so it should be:

    if loss > 0:
        bias -= learning_rate * grad_b
        theta -= learning_rate * grad_t
    elif loss < 0:
        bias += learning_rate * grad_b
        theta += learning_rate * grad_t
    

    Finally, here is the modified code, which works well:

    import numpy as np
    import sys

    def gen_data(num_points=500, slope=1, bias=10, x_max=50):
        f = lambda z: slope * z + bias
        x = np.zeros(shape=(num_points))
        y = np.zeros(shape=(num_points))
    
        for i in range(num_points):
            x_temp = np.random.uniform()*x_max
            x[i] = x_temp
            y[i] = f(x_temp) #+ np.random.normal(scale=3.0)
            #print('x:',x[i],'        y:',y[i])
    
        return (x, y)
    
    def gradientDescent2(x, y, learning_rate=0.001, epochs=100):
        theta = np.random.rand()
        bias = np.random.rand()
    
        for i in range(0, epochs):
            for j in range(len(x)):
                loss = (theta * x[j] + bias) - y[j]
                cost = np.mean(loss**2) / 2
                # print('Iteration {} | Cost: {}'.format(i, cost))

                grad_b = 1
                grad_t = x[j]

                if loss > 0:
                    bias -= learning_rate * grad_b
                    theta -= learning_rate * grad_t
                elif loss < 0:
                    bias += learning_rate * grad_b
                    theta += learning_rate * grad_t
    
        return (theta, bias)
    
    def main():
        x,y =gen_data()
        ta,bias = gradientDescent2(x,y)
        print('theta:',ta)
        print('bias:',bias)
    
    if __name__ == '__main__':
        sys.exit(int(main() or 0))
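The sign-dependent rule in this answer is, in effect, a fixed-size step against the sign of the residual (i.e. it follows the gradient of the absolute loss rather than the squared loss). A minimal sketch of a single update step, with the hypothetical helper name sign_update:

```python
import numpy as np

def sign_update(theta, bias, x_j, y_j, learning_rate=0.001):
    """One step of the sign-based rule above: move each parameter
    by a fixed amount against the sign of the residual."""
    loss = (theta * x_j + bias) - y_j
    step = np.sign(loss)                  # +1 if loss > 0, -1 if loss < 0
    bias -= learning_rate * step          # grad_b = 1
    theta -= learning_rate * step * x_j   # grad_t = x[j]
    return theta, bias

# With theta = bias = 0 and target y = 5 at x = 2, the residual is
# negative, so both parameters move up by one fixed step.
theta, bias = sign_update(0.0, 0.0, x_j=2.0, y_j=5.0)
```

Note that the step size no longer shrinks as the residual shrinks, which is why this variant behaves differently from the squared-loss gradients in the question.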