Question

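I have a fairly simple constrained optimization problem, but I get different answers depending on how I solve it. Let's get the imports and a pretty-print function out of the way first: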

import numpy as np
from scipy.optimize import minimize, LinearConstraint, NonlinearConstraint, SR1

def print_res( res, label ):
    print("\n\n ***** ", label, " ***** \n")
    print(res.message)
    print("obj func value at solution", obj_func(res.x))
    print("starting values: ", x0)
    print("ending values:   ", res.x.astype(int) )
    print("% diff", (100.*(res.x-x0)/x0).astype(int) )
    print("target achieved?",target,res.x.sum())

The sample data is very simple:

n = 5
x0 = np.arange(1,6) * 10_000
target = x0.sum() + 5_000   # increase sum from 150,000 to 155,000

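Here is the constrained optimization (including Jacobians). In words, the objective function I want to minimize is just the sum of squared percentage changes from the initial values to the final values. The linear equality constraint simply requires x.sum() to equal a constant.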

def obj_func(x):
    return ( ( ( x - x0 ) / x0 ) ** 2 ).sum()

def obj_jac(x):
    return 2. * ( x - x0 ) / x0 ** 2

def constr_func(x):
    return x.sum() - target

def constr_jac(x):
    return np.ones(n)
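
The objective and its Jacobian defined above can be sanity-checked against a finite-difference gradient. The following is a minimal sketch, not part of the original code: the helper central_diff_grad, the step size h=1.0, and the test point 1% above x0 are all illustrative assumptions.

```python
import numpy as np

x0 = np.arange(1, 6) * 10_000.0   # same starting values as above, as floats

def obj_func(x):
    return (((x - x0) / x0) ** 2).sum()

def obj_jac(x):
    return 2.0 * (x - x0) / x0 ** 2

def central_diff_grad(f, x, h=1.0):
    """Central-difference gradient (hypothetical helper).

    For a quadratic objective the central difference is exact up to
    rounding, so a large step such as h=1.0 is acceptable here.
    """
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad

x_test = x0 * 1.01                      # arbitrary test point, 1% above x0
analytic = obj_jac(x_test)
numeric = central_diff_grad(obj_func, x_test)
max_rel_err = np.max(np.abs(analytic - numeric) / np.abs(analytic))

print("analytic gradient: ", analytic)
print("numeric gradient:  ", numeric)
print("max relative error:", max_rel_err)
```

If the two gradients agree, the Jacobian itself is unlikely to be the culprit. The printout also shows how small the gradient entries are at this scale of x0, which matters for any tolerance-based stopping rule.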

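For comparison, I have refactored the problem as an unconstrained minimization by using the equality constraint to replace x[0] with a function of x[1:]. Note that the unconstrained function is passed x0[1:], whereas the constrained function is passed x0.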

def unconstr_func(x):
    x_one       = target - x.sum()
    first_term  = ( ( x_one - x0[0] ) / x0[0] ) ** 2
    second_term = ( ( ( x - x0[1:] ) / x0[1:] ) ** 2 ).sum()
    return first_term + second_term
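
As a quick check that this substitution is equivalent to the constrained objective, the sketch below evaluates both at the same feasible point. The random trial point and the seed are arbitrary assumptions made for illustration.

```python
import numpy as np

x0 = np.arange(1, 6) * 10_000.0
target = x0.sum() + 5_000

def obj_func(x):
    return (((x - x0) / x0) ** 2).sum()

def unconstr_func(x):
    x_one = target - x.sum()                       # x[0] implied by the constraint
    first_term = ((x_one - x0[0]) / x0[0]) ** 2
    second_term = (((x - x0[1:]) / x0[1:]) ** 2).sum()
    return first_term + second_term

rng = np.random.default_rng(0)                          # arbitrary seed
x_tail = x0[1:] + rng.uniform(-500, 500, size=4)        # trial values for x[1:]
x_full = np.hstack([target - x_tail.sum(), x_tail])     # recover x[0]

print("constraint satisfied:", np.isclose(x_full.sum(), target))
print("constrained objective:  ", obj_func(x_full))
print("unconstrained objective:", unconstr_func(x_tail))
```

Both objective values should match for any x[1:], since the first element is always chosen to satisfy the sum constraint.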

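Then I try to minimize in three ways:

Unconstrained with 'Nelder-Mead'
Constrained with 'trust-constr' (with and without the constraint Jacobian)
Constrained with 'SLSQP' (with and without the constraint Jacobian)

Code: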

##### (1) unconstrained

res0 = minimize( unconstr_func, x0[1:], method='Nelder-Mead')   # OK, but weird note
res0.x = np.hstack( [target - res0.x.sum(), res0.x] )
print_res( res0, 'unconstrained' )    

##### (2a) constrained -- trust-constr w/ jacobian

nonlin_con = NonlinearConstraint( constr_func, 0., 0., constr_jac )
resTCjac = minimize( obj_func, x0, method='trust-constr',
                     jac='2-point', hess=SR1(), constraints = nonlin_con )
print_res( resTCjac, 'trust-const w/ jacobian' )

##### (2b) constrained -- trust-constr w/o jacobian

nonlin_con = NonlinearConstraint( constr_func, 0., 0. )    
resTC = minimize( obj_func, x0, method='trust-constr',
                  jac='2-point', hess=SR1(), constraints = nonlin_con )    
print_res( resTC, 'trust-const w/o jacobian' )

##### (3a) constrained -- SLSQP w/ jacobian

eq_cons = { 'type': 'eq', 'fun' : constr_func, 'jac' : constr_jac }
resSQjac = minimize( obj_func, x0, method='SLSQP',
                     jac = obj_jac, constraints = eq_cons )    
print_res( resSQjac, 'SLSQP w/ jacobian' )

##### (3b) constrained -- SLSQP w/o jacobian

eq_cons = { 'type': 'eq', 'fun' : constr_func }    
resSQ = minimize( obj_func, x0, method='SLSQP',
                  jac = obj_jac, constraints = eq_cons )
print_res( resSQ, 'SLSQP w/o jacobian' )

Here is some simplified output (you can of course run the code to get the full output):

starting values:  [10000 20000 30000 40000 50000]

***** (1) unconstrained  *****
Optimization terminated successfully.
obj func value at solution 0.0045454545454545305
ending values:    [10090 20363 30818 41454 52272]

***** (2a) trust-const w/ jacobian  *****
The maximum number of function evaluations is exceeded.
obj func value at solution 0.014635854609684874
ending values:    [10999 21000 31000 41000 51000]

***** (2b) trust-const w/o jacobian  *****
`gtol` termination condition is satisfied.
obj func value at solution 0.0045454545462939935
ending values:    [10090 20363 30818 41454 52272]

***** (3a) SLSQP w/ jacobian  *****
Optimization terminated successfully.
obj func value at solution 0.014636111111111114
ending values:    [11000 21000 31000 41000 51000]    

***** (3b) SLSQP w/o jacobian  *****   
Optimization terminated successfully.
obj func value at solution 0.014636111111111114
ending values:    [11000 21000 31000 41000 51000]

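Notes:

(1) and (2b) are plausible solutions in that they achieve noticeably lower objective function values and, intuitively, we would expect variables with larger starting values to move more (in both absolute and percentage terms) than those with smaller ones.
Adding the Jacobian to 'trust-constr' causes it to get the wrong answer (or at least a worse one) and to stop because the maximum number of function evaluations is exceeded. Perhaps the Jacobian is wrong, but the function is so simple that I'm pretty sure it is correct (?)
'SLSQP' does not seem to work with or without the Jacobian supplied, yet it runs very fast and claims to terminate successfully. That is quite worrying, since getting the wrong answer while reporting success is pretty much the worst possible outcome.
Initially I used very small starting values and targets (just 1/1,000 of what I have above), and in that case all 5 approaches above work fine and give the same answer. My sample data is still very small, and it seems rather strange for it to handle 1,2,..,5 but not 1000,2000,..5000.
FWIW, note that the 3 incorrect results all hit the target by adding 1,000 to each initial value. That satisfies the constraint but comes nowhere close to minimizing the objective function (variables with higher initial values should be increased more than those with lower ones in order to minimize the sum of squared percentage differences).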
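
The intuition in the notes above can be checked against a closed-form solution, since the problem is a quadratic objective with a single linear equality constraint. The sketch below is not from the original post: it uses the Lagrangian stationarity condition 2*(x_i - x0_i)/x0_i**2 = lam, which implies each change is proportional to x0_i**2.

```python
import numpy as np

x0 = np.arange(1, 6) * 10_000.0
target = x0.sum() + 5_000

def obj_func(x):
    return (((x - x0) / x0) ** 2).sum()

# Stationarity gives x_i - x0_i = (lam / 2) * x0_i**2 for every i.
# The constraint sum(x) = target then fixes the common factor.
delta = target - x0.sum()                       # total increase required
x_star = x0 + delta * x0 ** 2 / (x0 ** 2).sum()
f_star = obj_func(x_star)

# For comparison: spreading the increase equally, as the stalled runs do.
x_equal = x0 + delta / x0.size
f_equal = obj_func(x_equal)

print("closed-form solution:", x_star.astype(int))
print("objective at optimum:", f_star)
print("objective at x0+1000:", f_equal)
```

Under these assumptions the closed-form values agree with results (1) and (2b), and the equal-split point reproduces the objective value reported by the three runs that stalled.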

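So my question is really just: what is happening here, and why do only (1) and (2b) seem to work?

More generally, I would like to find a good Python-based approach to this and similar optimization problems, and I will consider answers using packages other than scipy, although the best answer would ideally also address what is going on with scipy here (e.g. is this user error or a bug I should post to GitHub?).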

Answer 1

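This problem can be solved using nlopt, a library for nonlinear optimization that I have been very impressed with.

First, the objective function and its gradient are both defined in the same function: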

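import nlopt
import numpy as np

# nlopt calls objectives and constraints as f(x, grad) and expects the
# gradient to be written into grad in place when grad is non-empty.
def obj_func(x, grad):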
    if grad.size > 0:
        grad[:] = obj_jac(x)
    return ( ( ( x/x0 - 1 )) ** 2 ).sum()

def obj_jac(x):
    return 2. * ( x - x0 ) / x0 ** 2

def constr_func(x, grad):
    if grad.size > 0:
        grad[:] = constr_jac(x)
    return x.sum() - target

def constr_jac(x):
    return np.ones(n)

Then, to run the minimization using Nelder-Mead and SLSQP:

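# Nelder-Mead on the reduced (n-1)-variable problem. nlopt passes (x, grad)
# to the objective, so the one-argument unconstr_func from the question is wrapped.
# print_res below is a three-argument variant (x, fval, label) that is not shown here.
opt = nlopt.opt(nlopt.LN_NELDERMEAD, len(x0) - 1)
opt.set_min_objective(lambda x, grad: float(unconstr_func(x)))
opt.set_ftol_abs(1e-15)
xopt = opt.optimize(x0[1:].astype(float))       # astype returns a float copy
xopt = np.hstack([target - xopt.sum(), xopt])   # recover x[0] from the constraint
fval = opt.last_optimum_value()
print_res(xopt, fval, "Nelder-Mead")

# SLSQP on the full problem with the equality constraint and analytic gradients.
opt = nlopt.opt(nlopt.LD_SLSQP, len(x0))
opt.set_min_objective(obj_func)
opt.add_equality_constraint(constr_func)
opt.set_ftol_abs(1e-15)
xopt = opt.optimize(x0.astype(float))
fval = opt.last_optimum_value()
print_res(xopt, fval, "SLSQP w/ jacobian")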

And here are the results:

 *****  Nelder-Mead  ***** 

obj func value at solution 0.00454545454546
result:  3
starting values:  [ 10000.  20000.  30000.  40000.  50000.]
ending values:    [10090 20363 30818 41454 52272]
% diff [0 1 2 3 4]
target achieved? 155000.0 155000.0


 *****  SLSQP w/ jacobian  ***** 

obj func value at solution 0.00454545454545
result:  3
starting values:  [ 10000.  20000.  30000.  40000.  50000.]
ending values:    [10090 20363 30818 41454 52272]
% diff [0 1 2 3 4]
target achieved? 155000.0 155000.0

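While testing, I think I found what the issue was with the original attempt. If I set the absolute tolerance on the function to 1e-8, which is roughly what the scipy functions default to (SLSQP's documented default 'ftol' is 1e-6), I get: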

 *****  Nelder-Mead  ***** 

obj func value at solution 0.0045454580693
result:  3
starting values:  [ 10000.  20000.  30000.  40000.  50000.]
ending values:    [10090 20363 30816 41454 52274]
% diff [0 1 2 3 4]
target achieved? 155000.0 155000.0


 *****  SLSQP w/ jacobian  ***** 

obj func value at solution 0.0146361108503
result:  3
starting values:  [ 10000.  20000.  30000.  40000.  50000.]
ending values:    [10999 21000 31000 41000 51000]
% diff [9 5 3 2 2]
target achieved? 155000.0 155000.0

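This is exactly what you were seeing. So my guess is that during SLSQP the minimizer ends up at a point in parameter space where the next step changes the objective by less than the tolerance (1e-8 here), and it stops there.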
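
One way to make this guess concrete is to look at the size of the gradient at the point where the runs stopped. The sketch below assumes, purely for illustration, that the solver's Hessian approximation there is still close to the identity; that is an assumption about SLSQP's internal state, not something established above. Under that assumption it estimates how much one feasible steepest-descent step would change the objective.

```python
import numpy as np

x0 = np.arange(1, 6) * 10_000.0
x_stall = x0 + 1_000.0                 # the point reported by the stalled runs

def obj_func(x):
    return (((x - x0) / x0) ** 2).sum()

def obj_jac(x):
    return 2.0 * (x - x0) / x0 ** 2

g = obj_jac(x_stall)
# Remove the component along ones(n), the constraint normal, so that the
# step keeps x.sum() unchanged.
g_tangent = g - g.mean()

# Assumed identity-metric step: move by minus the projected gradient.
x_next = x_stall - g_tangent
change = obj_func(x_next) - obj_func(x_stall)
predicted = -(g_tangent @ g_tangent)   # first-order estimate of the change

print("gradient at stalled point:", g)
print("first-order predicted change:", predicted)
print("actual change in objective:  ", change)
print("gap to the best value found: ", obj_func(x_stall) - 0.0045454545454545305)
```

With these numbers the step changes the objective by far less than 1e-8, even though the stalled point is about 0.01 above the best objective value reported in the question. That is consistent with the tolerance explanation, but it does not prove what SLSQP does internally.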

Answer 2

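This is a partial answer to my own question, which I am putting here to keep the question from getting even bigger, but I would still love to see a more comprehensive and explanatory answer. These fixes are based on comments from two other people, but neither of them wrote the code out in full and I thought it made sense to do that explicitly, so here it is:

Fix 2a (trust-constr with Jacobian)

It seems the key point regarding the Jacobian and Hessian is to specify either neither or both (but not the Jacobian alone). @SubhaneilLahiri commented to this effect, and there was also a warning message to this effect that I failed to notice at first:

UserWarning: delta_grad == 0.0. Check if the approximated function is linear. If the function is linear better results can be obtained by defining the Hessian as zero instead of using quasi-Newton approximations.

So I fixed it by defining the Hessian function for the constraint: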

def constr_hess(x,v):
    return np.zeros([n,n])

and adding it to the constraint:

nonlin_con = NonlinearConstraint( constr_func, 0., 0., constr_jac, constr_hess )

Fix 3a and 3b (SLSQP)

This just seemed to be a matter of making the tolerance smaller, as suggested by @user545424. So I simply added options={'ftol':1e-15} to the minimization:

resSQjac = minimize( obj_func, x0, method='SLSQP',
                     options={'ftol':1e-15},
                     jac = obj_jac, constraints = eq_cons )
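
Tightening 'ftol' is the fix described above. A different option, not taken from either answer and offered only as a sketch, is to rescale the variables so that the objective is well scaled and the default tolerance may be enough. The change of variables y = x / x0 and the names scaled_obj, scaled_jac and scaled_cons are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

x0 = np.arange(1, 6) * 10_000.0
target = x0.sum() + 5_000

# Work in relative units y = x / x0, so the objective is sum((y - 1)**2)
# and its gradient entries are of order one rather than order 1e-6.
def scaled_obj(y):
    return ((y - 1.0) ** 2).sum()

def scaled_jac(y):
    return 2.0 * (y - 1.0)

# The constraint sum(x) = target becomes sum(x0 * y) = target.
scaled_cons = {
    'type': 'eq',
    'fun': lambda y: (x0 * y).sum() - target,
    'jac': lambda y: x0,
}

res = minimize(scaled_obj, np.ones_like(x0), method='SLSQP',
               jac=scaled_jac, constraints=scaled_cons)   # default ftol
x_scaled = x0 * res.x                                     # back to original units

print(res.message)
print("ending values:   ", x_scaled.astype(int))
print("target achieved?", target, x_scaled.sum())
```

The design choice here is to change the problem's scaling instead of the solver's tolerance. Whether that is preferable depends on the problem, so treat it as something to try alongside the 'ftol' fix, not as a replacement for it.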
