The gradient returned by fmin_l_bfgs_b at the minimum is not zero

Date: 2015-12-31 14:39:38

Tags: python optimization machine-learning scipy gradient

I am using fmin_l_bfgs_b to estimate the minimum of a function. The problem is unconstrained, and I pass approx_grad=True so the gradient is approximated numerically.

    weights_sp_new, func_val, info_dict = fmin_l_bfgs_b(
        func_to_minimize, self.w_vectors[si][pj],
        args=(self.sigma_vector[si][pj], Y, X, E_step_results[si][pj]),
        approx_grad=True, factr=10000000.0, pgtol=1e-05, epsilon=1e-04)
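
For reference, here is a minimal, self-contained sketch of this kind of call. The quadratic objective and starting point below are made-up stand-ins for the real func_to_minimize and weight vectors, not the actual model:

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    # Toy stand-in objective with a known minimum at [1, -2].
    def func_to_minimize(w):
        return np.sum((w - np.array([1.0, -2.0])) ** 2)

    x0 = np.zeros(2)  # illustrative initial guess
    w_min, f_min, info = fmin_l_bfgs_b(func_to_minimize, x0,
                                       approx_grad=True, factr=1e7,
                                       pgtol=1e-5, epsilon=1e-4)

    # 'warnflag' == 0 means success; 'task' names the criterion that fired,
    # and 'grad' is the (approximate) gradient at the returned point.
    print(info['task'])
    print(np.max(np.abs(info['grad'])))  # compare against pgtol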

I tried different initial guesses on the same objective function. The information dictionaries it returned are:

    information dictionary: {'nit': 180, 'funcalls': 4480, 'warnflag': 0,
    'task': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH',
    'grad': array([  1.69003327e+00,   2.29250366e+00,   1.55528930e+00,
                     9.84251656e-01,  -1.10133624e-02,   1.83795773e+00,
                     6.44715933e-01,   2.01643592e+00,   8.71323232e-01,
                     9.93009353e-01,   1.34615338e+00,   4.20859578e-04,
                    -2.22691328e-01,  -2.13318804e-01,  -4.38475622e-01,
                     4.79004570e-01,  -4.11879746e-01,   1.71003313e+00])}


    information dictionary: {'nit': 0, 'funcalls': 20, 'warnflag': 0,
    'task': b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL',
    'grad': array([  1.84672949e-20,   1.49550746e-20,   1.11115003e-20,
                     2.73908962e-20,   0.00000000e+00,   2.62916240e-20,
                     0.00000000e+00,   4.95859400e-20,   4.70618521e-20,
                     4.77249742e-20,   2.80864703e-20,   0.00000000e+00,
                     1.84975333e-21,   7.63125358e-21,   1.35733459e-20,
                     6.34943656e-21,   1.02743864e-20,   5.31287405e-20])}

    information dictionary: {'nit': 107, 'funcalls': 2460, 'warnflag': 0,
    'task': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH',
    'grad': array([ -3.09184019,  -0.70217764,   0.72096009,  -3.23745189,
                    -1.18111435,  -4.13185742,   3.90762754,   2.28011806,
                    -3.02289147,  -1.21219666,   1.80007832, -12.44630606,
                    -1.59126124,   1.59139978,  -1.96677574,  -0.50837465,
                     1.20439043,  -1.58858602])}

    information dictionary: {'nit': 132, 'funcalls': 2980, 'warnflag': 0,
    'task': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH',
    'grad': array([ -8.56568098,  -9.39712794,  -8.82591339,  -8.61912864,
                    -0.53956945,  -9.46679887,   0.89827947, -10.64991782,
                    -6.53652169,  -7.34566878,  -8.98861319,   1.28335021,
                    -2.39830071,  -1.2056133 ,  -0.81190425,  -1.3537686 ,
                    -1.65028498,  -8.30791505])}

You can see that the optimizer reports successful convergence, but the gradient at the returned minimum is not zero. I know this means I have not found the exact minimum and the function value could still decrease. What should I do now? Or can I just accept this "approximate" minimum?

1 Answer:

Answer 0 (score: 1)

There are two distinct situations in the samples you provided:

  1. The second run of your algorithm converged nicely: it terminated with b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL', and as you can see

    'grad': array([  1.84672949e-20,   1.49550746e-20,   1.11115003e-20,
                     2.73908962e-20,   0.00000000e+00,   2.62916240e-20,
                     0.00000000e+00,   4.95859400e-20,   4.70618521e-20,
                     4.77249742e-20,   2.80864703e-20,   0.00000000e+00,
                     1.84975333e-21,   7.63125358e-21,   1.35733459e-20,
                     6.34943656e-21,   1.02743864e-20,   5.31287405e-20])

    which is essentially zero (every component is on the order of 1e-20 or smaller).

  2. The remaining cases terminated with b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH', i.e. because the function value stopped changing significantly. In that situation you can do one (or more) of the following (a combined code sketch follows this list):

    • Decrease the factr parameter of fmin_l_bfgs_b. From the documentation:

      factr : float

      The iteration stops when (f^k - f^{k+1}) / max{|f^k|, |f^{k+1}|, 1} <= factr * eps,
      where eps is the machine precision, which is automatically generated by
      the code. Typical values for factr are: 1e12 for low accuracy; 1e7 for
      moderate accuracy; 10.0 for extremely high accuracy.

    • Think about your function: can it be simplified? Does it suffer from plateaus (very flat regions of the surface)? If so, perhaps you can reformulate it to reduce that effect.

    • Compute the analytical gradient and supply it to the optimizer (thereby improving precision).
    • Change the finite-difference step epsilon of fmin_l_bfgs_b, since your numerical gradient approximation may not be accurate enough.
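
Putting the last three suggestions together, here is a hedged sketch. The objective f and its gradient grad_f below are made-up stand-ins with a known analytical form; your real objective and gradient go in their place:

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    # Made-up objective with a known analytical gradient.
    def f(w):
        return np.sum((w - 3.0) ** 2)

    def grad_f(w):
        return 2.0 * (w - 3.0)  # exact gradient, no finite-difference noise

    x0 = np.zeros(5)

    # fprime supplies the analytical gradient; factr=10.0 demands extremely
    # high accuracy before the relative-reduction criterion can stop the run.
    w_min, f_min, info = fmin_l_bfgs_b(f, x0, fprime=grad_f,
                                       factr=10.0, pgtol=1e-8)
    print(info['task'], np.max(np.abs(info['grad'])))

    # Without an analytical gradient, keep approx_grad=True but use a smaller
    # finite-difference step than the 1e-04 in the question (scipy's default
    # epsilon is 1e-08) so the numerical gradient is less crude.
    w_min, f_min, info = fmin_l_bfgs_b(f, x0, approx_grad=True,
                                       factr=10.0, epsilon=1e-8)
    print(info['task'], np.max(np.abs(info['grad'])))

A quick sanity check after any run is whether max(|grad|) is below pgtol: if it is not, the run stopped on the function-value criterion rather than the gradient criterion, and tightening factr or improving the gradient is what will help.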