Question

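I have an array of scalars with m rows and n columns. I have a Variable(m) and a Variable(n) that I would like to solve for.

The two variables represent values that need to be broadcast over the columns and the rows, respectively.

I was naively thinking of writing the variables as Variable((m, 1)) and Variable((1, n)) and adding them together as if they were ndarrays. However, that doesn't work, because broadcasting is not allowed.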

import cvxpy as cp
import numpy as np

# Problem data.
m = 3
n = 4
np.random.seed(1)
data = np.random.randn(m, n)

# Construct the problem.
x = cp.Variable((m, 1))
y = cp.Variable((1, n))

objective = cp.Minimize(cp.sum(cp.abs(x + y + data)))
# or:
#objective = cp.Minimize(cp.sum_squares(x + y + data))

prob = cp.Problem(objective)

result = prob.solve()
print(x.value)
print(y.value)

This fails at the x + y expression with: ValueError: Cannot broadcast dimensions (3, 1) (1, 4).

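Now I'm wondering two things:

- Is my problem actually solvable using convex optimization?
- If yes, how can I express it in a way that cvxpy understands?

I'm very new to both convex optimization and cvxpy, and I hope I have described my problem well enough.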

Answer 1

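I offered to show you how to express this as a linear program, so here goes. I'm using Pyomo because I'm more familiar with it, but you could do something similar in PuLP.

To run this, you first need to install Pyomo and a linear programming solver such as glpk. glpk should work for reasonably sized problems, but if you find it takes too long to solve, you could try a (faster) commercial solver such as CPLEX or Gurobi.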

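You can install Pyomo via pip install pyomo or conda install -c conda-forge pyomo. You can install glpk from https://www.gnu.org/software/glpk/ or via conda install glpk. (I think PuLP comes with a bundled solver (CBC rather than glpk), so that might save you a step.)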

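Here's the script. Note that it computes the absolute error as a linear expression by defining one variable for the positive part of the error and another for the negative part, and then minimizes the sum of the two. The solver will always set one of them to zero, since that is an easy way to reduce the objective, and the other will then equal the absolute error.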

import random
import pyomo.environ as po

random.seed(1)

# ~50% sparse data set, big enough to populate every row and column
m = 10   # number of rows
n = 10   # number of cols
data = {
    (r, c): random.random()
    for r in range(m)
    for c in range(n)
    if random.random() >= 0.5
}

# define a linear program to find vectors
# x in R^m, y in R^n, such that x[r] + y[c] is close to data[r, c]

# create an optimization model object
model = po.ConcreteModel()

# create indexes for the rows and columns
model.ROWS = po.Set(initialize=range(m))
model.COLS = po.Set(initialize=range(n))

# create indexes for the dataset
model.DATAPOINTS = po.Set(dimen=2, initialize=data.keys())

# data values
model.data = po.Param(model.DATAPOINTS, initialize=data)

# create the x and y vectors
# (restricted to non-negative values here; use within=po.Reals to allow negative entries)
model.X = po.Var(model.ROWS, within=po.NonNegativeReals)
model.Y = po.Var(model.COLS, within=po.NonNegativeReals)

# create dummy variables to represent errors
model.ErrUp = po.Var(model.DATAPOINTS, within=po.NonNegativeReals)
model.ErrDown = po.Var(model.DATAPOINTS, within=po.NonNegativeReals)

# Force the error variables to match the error
def Calculate_Error_rule(model, r, c):
    pred = model.X[r] + model.Y[c]
    err = model.ErrUp[r, c] - model.ErrDown[r, c]
    return (model.data[r, c] + err == pred)
model.Calculate_Error = po.Constraint(
    model.DATAPOINTS, rule=Calculate_Error_rule
)

# Minimize the total error
def ClosestMatch_rule(model):
    return sum(
        model.ErrUp[r, c] + model.ErrDown[r, c]
        for (r, c) in model.DATAPOINTS
    )
model.ClosestMatch = po.Objective(
    rule=ClosestMatch_rule, sense=po.minimize
)

# Solve the model

# get a solver object
opt = po.SolverFactory("glpk")
# solve the model
# turn off "tee" if you want less verbose output
results = opt.solve(model, tee=True)

# show solution status
print(results)

# show verbose description of the model
model.pprint()

# show X and Y values in the solution
for r in model.ROWS:
    print('X[{}]: {}'.format(r, po.value(model.X[r])))
for c in model.COLS:
    print('Y[{}]: {}'.format(c, po.value(model.Y[c])))
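
The error-splitting trick used by ErrUp and ErrDown can be sanity-checked without any solver. The sketch below is only an illustration: split_error is a hypothetical helper, not part of Pyomo, and the sample values are made up. It shows that the cheapest non-negative split of a signed error costs exactly its absolute value, and that any other feasible split costs more.

```python
def split_error(err):
    """Split a signed error into non-negative (up, down) parts with up - down == err."""
    up = max(err, 0.0)
    down = max(-err, 0.0)
    return up, down


for err in (-1.5, 0.0, 2.25):
    up, down = split_error(err)
    # The split reproduces the signed error ...
    assert up - down == err
    # ... and its cost up + down equals the absolute error.
    assert up + down == abs(err)
    # Any other feasible split (up + t, down + t) with t > 0 costs 2*t more,
    # so a minimizing solver has no reason to choose it.
    t = 0.5
    assert (up + t) - (down + t) == err
    assert (up + t) + (down + t) == abs(err) + 2 * t
    print(err, "->", (up, down))
```

This is why the linear program can minimize ErrUp + ErrDown and still end up with the sum of absolute errors.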

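Just to complete the story, here is a solution that is closer to your original example. It uses cvxpy, but with the sparse-data approach from my solution above.

I don't know the "official" way to do elementwise calculations in cvxpy, but it seems to work fine to use the standard Python sum function over many individual cp.abs(...) terms.

This gives a solution that is very slightly worse than the linear program's, but you may be able to fix that by adjusting the solver tolerance.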

import cvxpy as cp
import random

random.seed(1)

# Problem data.
# ~50% sparse data set
m = 10   # number of rows
n = 10   # number of cols
data = {
    (i, j): random.random()
    for i in range(m)
    for j in range(n)
    if random.random() >= 0.5
}

# Construct the problem.
x = cp.Variable(m)
y = cp.Variable(n)

objective = cp.Minimize(
    sum(
        cp.abs(x[i] + y[j] + data[i, j])
        for (i, j) in data.keys()
    )
)

prob = cp.Problem(objective)

result = prob.solve()
print(x.value)
print(y.value)
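
A possible vectorized alternative to summing many scalar terms is to gather the observed cells with two selection matrices. The sketch below checks only the algebra, in plain NumPy with stand-in numeric vectors; the data set, the names A and B, and the random values are all assumptions made for illustration.

```python
import numpy as np

m, n = 3, 4
# Hypothetical sparse data set: only some (row, col) cells are observed.
data = {(0, 1): 0.5, (1, 3): -1.0, (2, 0): 2.0, (2, 3): 0.25}

keys = list(data.keys())
rows = np.array([i for i, _ in keys])
cols = np.array([j for _, j in keys])
vals = np.array([data[k] for k in keys])

# Selection matrices: row k of A picks x[rows[k]], row k of B picks y[cols[k]].
A = np.zeros((len(keys), m))
B = np.zeros((len(keys), n))
A[np.arange(len(keys)), rows] = 1.0
B[np.arange(len(keys)), cols] = 1.0

# Stand-in numeric values for x and y, used only to check the algebra.
rng = np.random.default_rng(0)
x = rng.normal(size=m)
y = rng.normal(size=n)

vectorized = A @ x + B @ y + vals
elementwise = np.array([x[i] + y[j] + data[i, j] for (i, j) in keys])
assert np.allclose(vectorized, elementwise)
```

With cvxpy Variables in place of the numeric x and y, the same A @ x + B @ y + vals expression should be affine, so cp.sum(cp.abs(...)) could replace the Python-level sum. That is untested here, and for large problems A and B would more naturally be scipy.sparse matrices.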

Answer 2

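I don't fully get the idea, but here is some hacky stuff based on this assumption:

You want a cvxpy equivalent of numpy's broadcasting behaviour for arrays of shape (m, 1) + (1, n).

So, in numpy: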

import numpy as np

m = 3
n = 4
np.random.seed(1)

a = np.random.randn(m, 1)
b = np.random.randn(1, n)

a
array([[ 1.62434536],
       [-0.61175641],
       [-0.52817175]])

b
array([[-1.07296862,  0.86540763, -2.3015387 ,  1.74481176]])

a + b
array([[ 0.55137674,  2.48975299, -0.67719333,  3.36915713],
       [-1.68472504,  0.25365122, -2.91329511,  1.13305535],
       [-1.60114037,  0.33723588, -2.82971045,  1.21664001]])

Let's mimic this with np.kron, which has a cvxpy equivalent:

aLifted = np.kron(np.ones((1,n)), a)
bLifted = np.kron(np.ones((m,1)), b)

aLifted
array([[ 1.62434536,  1.62434536,  1.62434536,  1.62434536],
       [-0.61175641, -0.61175641, -0.61175641, -0.61175641],
       [-0.52817175, -0.52817175, -0.52817175, -0.52817175]])

bLifted
array([[-1.07296862,  0.86540763, -2.3015387 ,  1.74481176],
       [-1.07296862,  0.86540763, -2.3015387 ,  1.74481176],
       [-1.07296862,  0.86540763, -2.3015387 ,  1.74481176]])

aLifted + bLifted
array([[ 0.55137674,  2.48975299, -0.67719333,  3.36915713],
       [-1.68472504,  0.25365122, -2.91329511,  1.13305535],
       [-1.60114037,  0.33723588, -2.82971045,  1.21664001]])

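Let's check cvxpy semi-blindly (we only check the dimensions; I'm too lazy to set up a problem and fix the variables to check the output :-D):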

import cvxpy as cp
x = cp.Variable((m, 1))
y = cp.Variable((1, n))

cp.kron(np.ones((1,n)), x) + cp.kron(np.ones((m, 1)), y)
# Expression(AFFINE, UNKNOWN, (3, 4))

# looks good!
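
The same lifting can also be written as a matrix product with a vector of ones, which avoids kron altogether. The sketch below verifies this in plain NumPy only; the names a_lifted and b_lifted and the random inputs are assumptions for illustration, and whether this form is any better for cvxpy is not checked here.

```python
import numpy as np

m, n = 3, 4
rng = np.random.default_rng(1)
a = rng.normal(size=(m, 1))
b = rng.normal(size=(1, n))

# Multiplying by a row/column of ones repeats the vector across the other axis.
a_lifted = a @ np.ones((1, n))   # shape (m, n), every column equals a
b_lifted = np.ones((m, 1)) @ b   # shape (m, n), every row equals b

assert a_lifted.shape == (m, n) and b_lifted.shape == (m, n)
# Same result as the kron-based lifting and as plain NumPy broadcasting.
assert np.allclose(a_lifted, np.kron(np.ones((1, n)), a))
assert np.allclose(b_lifted, np.kron(np.ones((m, 1)), b))
assert np.allclose(a_lifted + b_lifted, a + b)
```

Both products are linear in a and b, so the corresponding expression with Variables in place of the arrays stays affine.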

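Now some caveats:

- I don't know how efficiently cvxpy can reason about this matrix form internally
  - It's unclear whether a simple list-comprehension-based form using cp.vstack and co. would be more efficient (it probably is)
- This operation itself kills all sparsity
  - (if both vectors are dense, your matrix is dense)
  - cvxpy and more or less all convex-optimization solvers rely on some sparsity assumption
    - Scaling this problem up to machine-learning dimensions will not make you happy
- There is probably a much more concise mathematical theory for your problem than using (sparsity-assuming) (pretty) general convex optimization (the DCP implemented in cvxpy is a subset of it)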
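
As one illustration of that last caveat, the sum_squares variant of the objective with dense data appears to have a closed form built from row, column, and grand means (the standard two-way additive least-squares fit). The sketch below is a NumPy check of that claim against a generic least-squares solve. It does not cover the cp.abs objective, and the particular x and y shown are just one choice, since adding a constant to x and subtracting it from y leaves every sum x_i + y_j unchanged.

```python
import numpy as np

m, n = 3, 4
rng = np.random.default_rng(1)
data = rng.normal(size=(m, n))

# Candidate closed form for min sum((x_i + y_j + data_ij)**2):
# the fitted sum x_i + y_j is -(row mean + column mean - grand mean).
row_mean = data.mean(axis=1)
col_mean = data.mean(axis=0)
grand_mean = data.mean()
x = -(row_mean - grand_mean)
y = -col_mean
closed_form_fit = x[:, None] + y[None, :]

# Cross-check against a generic least-squares solve of A @ z = -vec(data),
# where z = [x, y] and row i*n + j of A selects x_i and y_j.
A = np.hstack([
    np.kron(np.eye(m), np.ones((n, 1))),
    np.kron(np.ones((m, 1)), np.eye(n)),
])
z, *_ = np.linalg.lstsq(A, -data.ravel(), rcond=None)
lstsq_fit = (A @ z).reshape(m, n)

assert np.allclose(closed_form_fit, lstsq_fit)
```

If this holds for your real data as well, no solver is needed for the squared-error case, which also sidesteps the sparsity concern above.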

Is my problem suited for convex optimization, and if so, how do I express it with cvxpy?
