Question

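I am trying to implement numerical gradient computation in numpy, to use as the gradient callback function in cyipopt. My understanding of the numpy gradient function is that it should return the gradient computed at a point, based on a finite difference approximation.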

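I don't understand how to use this module to get the gradient of a nonlinear function. The sample problem given appears to be a linear function.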

>>> f = np.array([1, 2, 4, 7, 11, 16], dtype=float)
>>> np.gradient(f)
array([ 1. ,  1.5,  2.5,  3.5,  4.5,  5. ])
>>> np.gradient(f, 2)
array([ 0.5 ,  0.75,  1.25,  1.75,  2.25,  2.5 ])

My code snippet is as follows:

import numpy as np

# Hock & Schittkowski test problem #40
x = np.mgrid[0.75:0.85:0.01, 0.75:0.8:0.01, 0.75:0.8:0.01, 0.75:0.8:0.01]
# target is evaluation at x = [0.8, 0.8, 0.8, 0.8]
f = -x[0] * x[1] * x[2] * x[3]
g = np.gradient(f)

print(g)

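Another drawback is that I have to evaluate x at several points (and it returns the gradient at several points). Is there a better option in numpy/scipy for evaluating the gradient numerically at a single point, so that I can implement it as a callback function?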

Answer 1

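First, some warnings:

Numerical optimization is hard to do right
ipopt is very complex software
- combining ipopt with numerical differentiation sounds like asking for trouble, but that of course depends on your problem
- ipopt is almost always used with automatic-differentiation tools rather than numerical differentiation!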
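
To illustrate why numerical differentiation is delicate, here is a minimal sketch (not part of the original answer) that applies a forward difference to exp at x = 1, where the exact derivative is known. The helper name forward_diff and the three step sizes are illustrative assumptions.

```python
import math


def forward_diff(f, x, h):
    """Forward-difference approximation of f'(x) with step h."""
    return (f(x + h) - f(x)) / h


exact = math.exp(1.0)  # the derivative of exp at x = 1 is exp(1)

# Absolute error of the approximation for a large, a moderate and a tiny step.
errors = {h: abs(forward_diff(math.exp, 1.0, h) - exact)
          for h in (1e-2, 1e-8, 1e-14)}

for h, err in errors.items():
    print("h = %g -> abs error = %.3e" % (h, err))
```

A large step suffers from truncation error and a tiny step from floating-point round-off, which is why a step such as num_diff_eps in the ipopt example further down may need tuning for a given problem.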

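And some more:

Because this is a complex task, and the state of python + ipopt is not as nice as in some other languages (julia + JuMP, for example), it takes a bit of work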

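And some alternatives:

Use pyomo, which wraps ipopt and has automatic differentiation
Use casadi, which also wraps ipopt and has automatic differentiation
Use autograd to automatically compute gradients for a subset of numpy code
- then use cyipopt to add those
Use scipy.optimize.minimize with the SLSQP or COBYLA solvers, which can do everything for you (SLSQP can handle equality and inequality constraints; COBYLA handles only inequality constraints, where emulating an equality constraint with the pair x >= y and x <= y can work)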
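
As a rough sketch of the last alternative (not taken from the original answer), the same test problem can be handed to scipy.optimize.minimize with method="SLSQP". No jac argument is passed here, so SciPy approximates the derivatives by finite differences itself. The function names objective and constraints are illustrative choices; the problem data follow the ipopt example further down.

```python
import numpy as np
from scipy.optimize import minimize


def objective(x):
    # Hock & Schittkowski #40: minimize -x1 * x2 * x3 * x4
    return -np.prod(x)


def constraints(x):
    # Three equality constraints, each required to be zero.
    return np.array([
        x[0]**3 + x[1]**2 - 1.0,
        x[0]**2 * x[3] - x[2],
        x[3]**2 - x[1],
    ])


x0 = np.array([0.8, 0.8, 0.8, 0.8])  # infeasible starting point

# Without jac=..., SLSQP uses finite-difference gradients internally.
res = minimize(objective, x0, method="SLSQP",
               constraints=[{"type": "eq", "fun": constraints}])

print(res.x)
print(res.fun, res.success)
```

The known optimal objective value for this problem is -0.25, which gives a simple sanity check on res.fun.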

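Approaching your task with your tools

Your complete example problem is defined in Test Examples for Nonlinear Programming Codes.

Here is some code, based on numerical differentiation, that solves your test problem, including the official setup (function, gradients, starting point, bounds, ...):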

import numpy as np
import scipy.sparse as sps
import ipopt
from scipy.optimize import approx_fprime


class Problem40(object):
    """ # Hock & Schittkowski test problem #40
            Basic structure  follows:
            - cyipopt example from https://pythonhosted.org/ipopt/tutorial.html#defining-the-problem
            - which follows ipopt's docs from: https://www.coin-or.org/Ipopt/documentation/node22.html
            Changes:
            - numerical-diff using scipy for function & constraints
            - removal of hessian-calculation
              - we will use limited-memory approximation
                - ipopt docs: https://www.coin-or.org/Ipopt/documentation/node31.html
              - (because i'm too lazy to reason about the math; lagrange and co.)
    """
    def __init__(self):
        self.num_diff_eps = 1e-8  # maybe tuning needed!

    def objective(self, x):
        # callback for objective
        return -np.prod(x)  # -x1 x2 x3 x4

    def constraint_0(self, x):
        return np.array([x[0]**3 + x[1]**2 -1])

    def constraint_1(self, x):
        return np.array([x[0]**2 * x[3] - x[2]])

    def constraint_2(self, x):
        return np.array([x[3]**2 - x[1]])

    def constraints(self, x):
        # callback for constraints
        return np.concatenate([self.constraint_0(x),
                               self.constraint_1(x),
                               self.constraint_2(x)])

    def gradient(self, x):
        # callback for gradient
        return approx_fprime(x, self.objective, self.num_diff_eps)

    def jacobian(self, x):
        # callback for jacobian
        return np.concatenate([
            approx_fprime(x, self.constraint_0, self.num_diff_eps),
            approx_fprime(x, self.constraint_1, self.num_diff_eps),
            approx_fprime(x, self.constraint_2, self.num_diff_eps)])

    def hessian(self, x, lagrange, obj_factor):
        return False  # we will use quasi-newton approaches to use hessian-info

    # progress callback
    def intermediate(
            self,
            alg_mod,
            iter_count,
            obj_value,
            inf_pr,
            inf_du,
            mu,
            d_norm,
            regularization_size,
            alpha_du,
            alpha_pr,
            ls_trials
            ):

        print("Objective value at iteration #%d is - %g" % (iter_count, obj_value))

# Remaining problem definition; still following official source:
# http://www.ai7.uni-bayreuth.de/test_problem_coll.pdf

# start-point -> infeasible
x0 = [0.8, 0.8, 0.8, 0.8]

# variable-bounds -> empty => np.inf-approach deviates from cyipopt docs!
lb = [-np.inf, -np.inf, -np.inf, -np.inf]
ub = [np.inf, np.inf, np.inf, np.inf]

# constraint bounds -> c == 0 needed -> both bounds = 0
cl = [0, 0, 0]
cu = [0, 0, 0]

nlp = ipopt.problem(
            n=len(x0),
            m=len(cl),
            problem_obj=Problem40(),
            lb=lb,
            ub=ub,
            cl=cl,
            cu=cu
            )

# IMPORTANT: need to use limited-memory / lbfgs here as we didn't give a valid hessian-callback
nlp.addOption(b'hessian_approximation', b'limited-memory')
x, info = nlp.solve(x0)
print(x)
print(info)

# CORRECT RESULT & SUCCESSFUL STATE

Remarks about the code

We use scipy's approx_fprime, which essentially exists to serve the gradient-based optimizers in scipy.optimize
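
The single-point numerical gradient asked for in the question can be sketched with approx_fprime alone. This is a minimal illustration rather than part of the original answer; the step 1e-8 mirrors num_diff_eps above and may need tuning.

```python
import numpy as np
from scipy.optimize import approx_fprime


def objective(x):
    # Same objective as in Problem40: -x1 * x2 * x3 * x4
    return -np.prod(x)


x = np.array([0.8, 0.8, 0.8, 0.8])

# Forward-difference gradient evaluated at the single point x.
grad = approx_fprime(x, objective, 1e-8)

# Analytically, every partial derivative here equals -0.8**3 = -0.512.
print(grad)
```

Because approx_fprime takes the point and a callable, it can be called directly from a gradient callback, as the gradient method in the class above does.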

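As stated in the source comments, I did not take care of ipopt's need for the Hessian; we use ipopt's Hessian approximation
- the basic idea is described at wiki: LBFGS
I did ignore ipopt's need for the sparsity structure of the constraint Jacobian
- a default assumption is used: the default Hessian structure is that of a lower triangular matrix, and I won't give any guarantees about what can happen here (bad performance vs. breaking everything)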

Answer 2

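I think you have some misunderstanding about what a mathematical function is and what its numerical implementation is.

You should define your function as: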

def func(x1, x2, x3, x4):
    return -x1*x2*x3*x4

Now, if you want to evaluate your function at specific points, you can do so using the np.mgrid you provided.

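If you want to compute the gradient, use scipy.misc.derivative (https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.derivative.html). Note that the default value of dx is usually bad; change it to 1e-5. Numerically, there is no difference between evaluating the gradient of a linear function and that of a nonlinear one; the only difference is that the gradient of a nonlinear function is not the same everywhere.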
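
scipy.misc.derivative differentiates with respect to a single variable and has been deprecated since SciPy 1.10, so it may be unavailable in newer releases. As a hedged alternative, the sketch below writes out the same central-difference idea by hand for all four variables. The helper name central_gradient is an illustrative assumption, and dx=1e-5 follows the suggestion above.

```python
import numpy as np


def func(x1, x2, x3, x4):
    return -x1 * x2 * x3 * x4


def central_gradient(f, point, dx=1e-5):
    """Central-difference gradient of f(x1, ..., xn) at a single point."""
    point = np.asarray(point, dtype=float)
    grad = np.zeros_like(point)
    for i in range(point.size):
        step = np.zeros_like(point)
        step[i] = dx  # perturb only the i-th variable
        grad[i] = (f(*(point + step)) - f(*(point - step))) / (2.0 * dx)
    return grad


g = central_gradient(func, [0.8, 0.8, 0.8, 0.8])
print(g)  # each entry should be close to -0.8**3 = -0.512
```

The function is evaluated only around the requested point, so no grid is needed and the helper can be reused at any other point.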

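What you did with np.gradient was actually to compute the gradient from the points stored in an array. The definition of your function is hidden inside the definition of f, so the gradient cannot be evaluated again at other points. Using your approach also makes the result depend on your discretization step.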
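
The dependence on the discretization step can be seen in a small sketch (an illustration, not from the original answer): sample x**3 on two grids and compare np.gradient with the exact derivative 3*x**2. The helper name sampled_derivative_error and the grid range are assumptions made for the example.

```python
import numpy as np


def sampled_derivative_error(step):
    """Max interior error of np.gradient for f(x) = x**3 sampled on [0, 2]."""
    x = np.arange(0.0, 2.0 + step / 2, step)
    f = x**3
    approx = np.gradient(f, step)  # finite differences between array samples
    exact = 3 * x**2
    # Compare interior points only; the two end points use one-sided differences.
    return np.max(np.abs(approx[1:-1] - exact[1:-1]))


coarse = sampled_derivative_error(0.1)
fine = sampled_derivative_error(0.01)
print(coarse, fine)
```

The error shrinks as the grid is refined, which shows that the result of np.gradient reflects the sampling of the array and not just the underlying function.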

Numerical gradient of a nonlinear function in numpy/scipy
