Question

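I am using the trust-krylov method of scipy.optimize.minimize on a very complex minimization problem (the actual code is too long to post here). What I'm finding is that the routine keeps running for many iterations even when the change in the objective function between iterations is below the 'tol' keyword I set. Let's call the objective function J and the change from iteration i to i + 1 dJ.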

I understand 'tol' to mean the minimum acceptable change dJ in the objective value between iterations. So, if I set 'tol' to 1.e-4 in

res = minimize(J, X0, method='trust-krylov', tol=1.e-4, jac=Jacobian, hessp=Hessian)

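then I would expect the code to stop a few iterations after dJ drops below this value and stays there. But I am running the code right now, and dJ is below 1.e-8 and has stayed that way for 16 iterations and counting. Is this possibly a bug?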

Answer 1

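You have misunderstood the tol parameter.

It is not about |obj_i - obj_(i-1)| (a computation on scalars), but about ||grad_i||_p (a computation on vectors).

The latter criterion is used very often and is part of most nonlinear optimizers (especially when no KKT conditions or second-order information are available). It also follows directly from theory: the first-order necessary optimality condition for an unconstrained local optimum is a zero gradient.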
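To make the distinction concrete, the two candidate stopping tests can be written side by side. The symbols x_k (the iterate at iteration k) and x^* (a local minimizer) are notation introduced here for illustration; they do not appear in the SciPy source.

```latex
% First-order necessary condition at an unconstrained local minimizer x^*
% of a differentiable objective J:
\nabla J(x^*) = 0

% What the question assumes `tol` controls (change in the objective value):
|J(x_{k+1}) - J(x_k)| < \mathrm{tol}

% What `tol` controls for trust-krylov (norm of the gradient vector):
\|\nabla J(x_k)\|_p < \mathrm{tol}
```

A small |dJ| does not imply a small gradient norm: a very short step can change J by almost nothing even where the gradient is still large.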

You can check the sources:

Here, tol becomes gtol:

if meth in ('bfgs', 'cg', 'l-bfgs-b', 'tnc', 'dogleg',
            'trust-ncg', 'trust-exact', 'trust-krylov'):
    options.setdefault('gtol', tol)
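A minimal sketch of what this mapping means in practice, assuming SciPy's built-in Rosenbrock helpers (rosen, rosen_der, rosen_hess_prod) as a stand-in for the asker's J, Jacobian and Hessian, and an arbitrary starting point: passing tol at the top level and passing gtol through options should request the same gradient-norm tolerance.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess_prod

# Stand-in problem (an assumption, not the asker's objective).
x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])

# Top-level `tol`: for trust-krylov it is forwarded as options['gtol'].
res_tol = minimize(rosen, x0, method='trust-krylov', jac=rosen_der,
                   hessp=rosen_hess_prod, tol=1e-4)

# The same request written explicitly through `options`.
res_gtol = minimize(rosen, x0, method='trust-krylov', jac=rosen_der,
                    hessp=rosen_hess_prod, options={'gtol': 1e-4})

# Both runs stop on the gradient norm, not on the change in the objective.
print(res_tol.nit, np.linalg.norm(res_tol.jac))
print(res_gtol.nit, np.linalg.norm(res_gtol.jac))
```

If gtol is also given explicitly in options, setdefault leaves it untouched, so the explicit value takes precedence over tol.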

Here, _minimize_trust_krylov is called:

elif meth == 'trust-krylov':
    return _minimize_trust_krylov(fun, x0, args, jac, hess, hessp,
                                  callback=callback, **options)

_trustregion_krylov talks about the other conditions and calls the final optimizer depending on exact / inexact:

if inexact:
    return _minimize_trust_region(fun, x0, args=args, jac=jac,
                                  hess=hess, hessp=hessp,
                                  subproblem=get_trlib_quadratic_subproblem(
                                      tol_rel_i=-2.0, tol_rel_b=-3.0,
                                      disp=trust_region_options.get('disp', False)
                                      ),
                                  **trust_region_options)
else:
    return _minimize_trust_region(fun, x0, args=args, jac=jac,
                                  hess=hess, hessp=hessp,
                                  subproblem=get_trlib_quadratic_subproblem(
                                      tol_rel_i=1e-8, tol_rel_b=1e-6,
                                      disp=trust_region_options.get('disp', False)
                                      ),
                                  **trust_region_options)

The optimizer that is ultimately used contains these lines:

gtol : float
    Gradient norm must be less than `gtol`
    before successful termination.

# check if the gradient is small enough to stop
if m.jac_mag < gtol:
    warnflag = 0
    break

# check if we have looked at enough iterations
if k >= maxiter:
    warnflag = 1
    break

Here is where jac_mag is computed:

@property
def jac_mag(self):
    """Magnitude of jacobian of objective function at current iteration."""
    if self._g_mag is None:
        self._g_mag = scipy.linalg.norm(self.jac)
    return self._g_mag

In the notation from the beginning of this answer, the Euclidean norm (p = 2) is used!
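To see this on a small problem, here is a hedged sketch that records both quantities per iteration. It assumes SciPy's Rosenbrock helpers as a stand-in objective; the callback name record and the history list are illustrative only. The callback recomputes J and its gradient, which costs extra evaluations and is done here only for monitoring.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess_prod

gtol = 1e-4
x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])  # arbitrary starting point
history = []  # one (J, ||grad||_2) pair per callback invocation

def record(xk):
    # Hypothetical monitoring callback: store the objective value and the
    # Euclidean gradient norm at the current iterate.
    history.append((rosen(xk), np.linalg.norm(rosen_der(xk))))

res = minimize(rosen, x0, method='trust-krylov', jac=rosen_der,
               hessp=rosen_hess_prod, tol=gtol, callback=record)

# Print |dJ| next to ||grad||_2; only the latter is compared against gtol.
prev_J = rosen(x0)
for k, (J, gnorm) in enumerate(history, start=1):
    print(f"iter {k:3d}  |dJ| = {abs(J - prev_J):.3e}  ||grad||_2 = {gnorm:.3e}")
    prev_J = J

print(res.message)
print("final gradient norm below gtol:", np.linalg.norm(res.jac) < gtol)
```

Note that when a trust-region step is rejected the iterate does not move, so |dJ| can be exactly zero for that iteration even though the gradient norm is still above gtol.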
