Linear regression converges, but the result is bad

Time: 2019-03-30 17:50:30

Tags: c machine-learning linear-regression

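Important: I am a beginner in ML, and I want to implement the algorithms I am learning myself, without using ML libraries.

I have a dataset of prices (y) as a function of kilometers driven (x), and I want to find the function that describes the data. You can find the dataset and the full code here: https://wetransfer.com/downloads/034d9918f6d29268f06be45d76e156f420190330174420/6af73b

I am using the classic gradient descent algorithm: my code works well on some linear regression problems, but not on mine.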
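For reference, the textbook batch update that "classic gradient descent" usually refers to for a line f(x) = theta1 * x + theta0 is written out below. The symbol m (number of samples) and alpha (learning rate) follow the usual convention; m is notation introduced here and is not a name from the code.

```latex
\begin{aligned}
\theta_0 &\leftarrow \theta_0 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( \theta_1 x_i + \theta_0 - y_i \right) \\
\theta_1 &\leftarrow \theta_1 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( \theta_1 x_i + \theta_0 - y_i \right) x_i
\end{aligned}
```

Both sums are evaluated with the old values of theta0 and theta1 before either parameter is overwritten.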


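#include <stdio.h>

/* Classic gradient descent algorithm */

/*
 * Sums the residuals (truth == 0) or the residuals multiplied by x
 * (truth == 1) over the first `epoch` samples.
 * Note: `epoch` is used here as a sample count.
 */
long double ft_sum(double *x, double *y, long double theta0, long double theta1, int epoch, int truth)
{
    long double     result = 0.00;
    long double     tmp;
    int             i;

    i = 0;
    while (epoch--)
    {
        /* Derivative part of the gradient descent */
        tmp = ((x[i] * theta1 + theta0)) - (y[i]);
        if (truth == 1)
            tmp = tmp * (x[i]);
        result += tmp;
        i++;
    }
    return (result);
}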

/* Linear regression */

void        single_linear_regression(double *x, double *y, double epoch, char *argv)
{
    long double     theta0 = 0; /* bias */
    long double     theta1 = 0; /* weight */
    long double     error = 100; /* Cost of the function */
    long double     tmp1;
    long double     tmp2;
    double          alpha = 0.0000000001; /* with higher learning rate it does not converge */
    int             i = 0;

    while (!(error > -0.4 && error < 0.4)) // it doesn't go below 0.4
    {
        tmp1 = theta0 - ((alpha * (1.00 / epoch) *
            (error = ft_sum(x, y, theta0, theta1, epoch - 1, 0))));
        tmp2 = theta1 - ((alpha * (1.00 / epoch) *
                (error = ft_sum(x, y, theta0, theta1, epoch - 1, 1))));
        theta0 = tmp1;
        theta1 = tmp2;
        printf("error := %Lf\n", error);
    }
    printf("error := %Lf | theta0 == %Lf | theta1 == %Lf\n", error, theta0, theta1);
}

In the end, I get:

error := 0.240723 | theta0 == 0.000004 | theta1 == 0.044168

(f(x) = 0.044x + 0.000004), whereas the actual function is -0.02x + 8500 ...
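A reference line to compare a gradient descent result against can be computed directly with the closed-form least-squares solution. The sketch below is an illustration only and is not part of the linked code; the function name `ols_fit` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Closed-form least-squares fit of y = theta0 + theta1 * x.
 * Returns 0 on success, -1 if every x is identical (slope undefined).
 * Hypothetical helper, written for illustration.
 */
int ols_fit(const double *x, const double *y, size_t n,
            double *theta0, double *theta1)
{
    double mean_x = 0.0, mean_y = 0.0;
    double sxx = 0.0, sxy = 0.0;
    size_t i;

    assert(n > 0);

    /* Means of x and y */
    for (i = 0; i < n; i++) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= (double)n;
    mean_y /= (double)n;

    /* Sum of squared deviations of x, and of cross deviations of x and y */
    for (i = 0; i < n; i++) {
        sxx += (x[i] - mean_x) * (x[i] - mean_x);
        sxy += (x[i] - mean_x) * (y[i] - mean_y);
    }
    if (sxx == 0.0)
        return -1;

    *theta1 = sxy / sxx;                   /* slope */
    *theta0 = mean_y - *theta1 * mean_x;   /* intercept */
    return 0;
}
```

Calling `ols_fit(x, y, n, &theta0, &theta1)` on the same arrays passed to `single_linear_regression` gives the pair of values that a converged gradient descent should approach.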

I have already tried normalizing the data to [0-1] and changing the starting values of the weight and bias, and I am really stumped.
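For comparison, here is a minimal sketch of min-max normalization to [0, 1] combined with batch gradient descent, including the step that maps the fitted parameters back to the original units. It is an assumption about what such an attempt could look like, not the code from the link; `min_max`, `fit_normalized`, `alpha`, and `iterations` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Finds the smallest and largest value in v[0..n-1] (n must be > 0). */
static void min_max(const double *v, size_t n, double *lo, double *hi)
{
    size_t i;

    *lo = *hi = v[0];
    for (i = 1; i < n; i++) {
        if (v[i] < *lo) *lo = v[i];
        if (v[i] > *hi) *hi = v[i];
    }
}

/*
 * Fits y = theta0 + theta1 * x by batch gradient descent on min-max
 * normalized data, then converts the parameters back to original units.
 * Returns 0 on success, -1 if x or y is constant (range of zero).
 */
int fit_normalized(const double *x, const double *y, size_t n,
                   double alpha, int iterations,
                   double *theta0, double *theta1)
{
    double x_lo, x_hi, y_lo, y_hi, x_range, y_range;
    double t0 = 0.0, t1 = 0.0;   /* parameters in normalized space */
    size_t i;
    int it;

    assert(n > 0);
    min_max(x, n, &x_lo, &x_hi);
    min_max(y, n, &y_lo, &y_hi);
    x_range = x_hi - x_lo;
    y_range = y_hi - y_lo;
    if (x_range == 0.0 || y_range == 0.0)
        return -1;

    for (it = 0; it < iterations; it++) {
        double g0 = 0.0, g1 = 0.0;   /* gradient sums for t0 and t1 */

        for (i = 0; i < n; i++) {
            double xn = (x[i] - x_lo) / x_range;
            double yn = (y[i] - y_lo) / y_range;
            double err = t0 + t1 * xn - yn;

            g0 += err;
            g1 += err * xn;
        }
        /* Simultaneous update: both sums used the old t0 and t1 */
        t0 -= alpha * g0 / (double)n;
        t1 -= alpha * g1 / (double)n;
    }

    /* Undo the scaling: yn = t0 + t1 * xn, with xn and yn as defined above */
    *theta1 = t1 * y_range / x_range;
    *theta0 = y_lo + y_range * t0 - *theta1 * x_lo;
    return 0;
}
```

With both variables scaled to [0, 1], a learning rate far larger than 1e-10 (for example 0.5) should be usable. Skipping the final mapping step leaves theta0 and theta1 in normalized units, which can look like a wrong result when compared against a line expressed in the original units.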

0 answers:

No answers