How do I implement momentum for mini-batch gradient descent in Python?

Date: 2017-07-26 23:57:13

Tags: python machine-learning gradient-descent

I am reading about momentum and I am trying to implement the momentum update equation in my mini-batch gradient descent code.

The problem is that it does not work: the regression line ends up far from the ideal line, and I am not sure whether my implementation is correct.


import math #needed for math.isnan below

def stochastic_gradient_descent_step(m, b, data_sample):

    n_points = data_sample.shape[0] #size of data
    m_grad = 0
    b_grad = 0
    stepper = 0.0001 #this is the learning rate
    z_m = 1.0
    z_b = 1.0
    betha = 0.81

    for i in range(n_points):

        #Get current pair (x,y)
        x = data_sample[i,0]
        y = data_sample[i,1]
        if math.isnan(x) or math.isnan(y): #skip rows with missing data instead of crashing
            continue

        #Accumulate the partial derivative for each point in the batch
        #Partial derivative with respect to 'm'
        dm = -((2/n_points) * x * (y - (m*x + b)))

        #Partial derivative with respect to 'b'
        db = -((2/n_points) * (y - (m*x + b)))


        #Update gradient
        m_grad = m_grad + dm
        b_grad = b_grad + db

    #calculate the momentum
    z_m = betha*z_m + m_grad
    z_b = betha*z_b + b_grad
    #Set the new 'better' updated 'm' and 'b'   
    m_updated = m - stepper*z_m
    b_updated = b - stepper*z_b

    return m_updated, b_updated

Edit:

I have now edited my code. As Sasha suggested, I put the gradient computation in one function and the momentum update in another, and I made z_m and z_b global so that they do not lose their values between iterations.

z_m = 0.0 #initialise to 0
z_b = 0.0 #initialise to 0

def getGradient(m, b, data_sample):
    n_points = data_sample.shape[0] #size of data
    m_grad = 0
    b_grad = 0

    for i in range(n_points):

        #Get current pair (x,y)
        x = data_sample[i,0]
        y = data_sample[i,1]
        if math.isnan(x) or math.isnan(y): #skip rows with missing data instead of crashing
            continue

        #Accumulate the partial derivative for each point in the batch
        #Partial derivative with respect to 'm'
        dm = -((2/n_points) * x * (y - (m*x + b)))

        #Partial derivative with respect to 'b'
        db = -((2/n_points) * (y - (m*x + b)))


        #Update gradient
        m_grad = m_grad + dm
        b_grad = b_grad + db


    return m_grad,b_grad

def calculateMomentum(m, b, m_grad, b_grad, betha=0.81, stepper=0.0001):
    global z_m, z_b
    #calculate the momentum (z_m and z_b persist between calls via the globals)
    z_m = betha*z_m + m_grad
    z_b = betha*z_b + b_grad
    #Set the new 'better' updated 'm' and 'b'
    m_updated = m - stepper*z_m
    b_updated = b - stepper*z_b
    return m_updated, b_updated
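The post does not show how the mini-batches are drawn; one common way to do it (an assumption of mine, not from the original code) is to sample rows without replacement with NumPy:

```python
import numpy as np

# toy (x, y) pairs with y = 2x, purely illustrative
rng = np.random.default_rng(42)
data = np.column_stack([np.arange(10.0), np.arange(10.0) * 2])

batch_size = 4
# pick 4 distinct row indices, then slice out those rows as the mini-batch
idx = rng.choice(data.shape[0], size=batch_size, replace=False)
data_sample = data[idx]  # this is what would be passed to getGradient(m, b, data_sample)
```

Sampling with `replace=False` guarantees no row appears twice inside one batch.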

The regression line is now computed correctly (probably). With plain SGD the final error is 59706304, and with momentum it is 56729062, though that difference may come down to which random mini-batches are picked when the gradients are computed.
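For context, the "final error" figures above look like unnormalised sums of squared residuals; here is a minimal sketch of such a metric (both the data and the assumption that it is a plain sum of squares are mine):

```python
import numpy as np

def total_squared_error(m, b, data):
    # sum of squared residuals of the line y = m*x + b over all (x, y) rows
    x, y = data[:, 0], data[:, 1]
    return float(np.sum((y - (m * x + b)) ** 2))

data = np.array([[1.0, 3.0], [2.0, 5.0], [3.0, 7.0]])  # exactly y = 2x + 1
perfect = total_squared_error(2.0, 1.0, data)  # → 0.0, the line fits exactly
worse = total_squared_error(0.0, 0.0, data)    # → 83.0, residuals are 3, 5, 7
```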


1 Answer:

Answer 0 (score: 0)

First, the initialisation is invalid: z_m and z_b should be initialised to 0 (since that is your first guess for the gradient). Second, in your current functional form you never "store" z_m or z_b for the next iteration, so they get reset (to the invalid value of 1) on every call.
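A self-contained sketch of the fix described above: initialise the velocities to 0 and carry them across iterations, here by returning them from the step function instead of using globals (the synthetic data and hyperparameters are illustrative assumptions, not from the original post):

```python
import numpy as np

def gradient(m, b, batch):
    # mean-squared-error gradients of y = m*x + b over one mini-batch
    x, y = batch[:, 0], batch[:, 1]
    err = y - (m * x + b)
    dm = (-2.0 / len(batch)) * np.sum(x * err)
    db = (-2.0 / len(batch)) * np.sum(err)
    return dm, db

def momentum_step(m, b, z_m, z_b, batch, beta=0.81, lr=0.001):
    # the velocities z_m, z_b are passed in and returned, so they persist
    dm, db = gradient(m, b, batch)
    z_m = beta * z_m + dm
    z_b = beta * z_b + db
    return m - lr * z_m, b - lr * z_b, z_m, z_b

# illustrative synthetic data: y ≈ 3x + 2 plus a little noise
rng = np.random.default_rng(0)
xs = rng.uniform(0, 10, 200)
ys = 3 * xs + 2 + rng.normal(0, 1, 200)
data = np.column_stack([xs, ys])

m, b, z_m, z_b = 0.0, 0.0, 0.0, 0.0  # velocities start at 0, not 1
for _ in range(3000):
    batch = data[rng.choice(len(data), 32, replace=False)]
    m, b, z_m, z_b = momentum_step(m, b, z_m, z_b, batch)
# m and b should now be close to 3 and 2
```

Passing the velocities in and out makes the state flow explicit; the global-variable version in the edited code works too, as long as z_m and z_b are initialised to 0 once, outside the functions.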