我有一个m
行和n
列的标量数组。我有一个Variable(m)
和一个Variable(n)
,我想找到解决方法。
这两个变量代表需要分别在列和行上广播的值。
我很天真地想将变量写为Variable((m, 1))
和Variable((1, n))
,然后将它们加在一起,就好像它们是ndarrays一样。但是,这是行不通的,因为不允许广播。
import cvxpy as cp
import numpy as np
# Problem data.
m = 3
n = 4
np.random.seed(1)
data = np.random.randn(m, n)
# Construct the problem.
x = cp.Variable((m, 1))
y = cp.Variable((1, n))
objective = cp.Minimize(cp.sum(cp.abs(x + y + data)))
# or:
#objective = cp.Minimize(cp.sum_squares(x + y + data))
prob = cp.Problem(objective)
result = prob.solve()
print(x.value)
print(y.value)
这在x + y
表达式:ValueError: Cannot broadcast dimensions (3, 1) (1, 4)
上失败。
现在我想知道两件事:
我对凸优化和cvxpy的概念还很陌生,希望我对问题的描述足够好。
答案 0 :(得分:1)
我愿意向您展示如何将其表示为线性程序,所以就到这里了。我正在使用Pyomo,因为我对此比较熟悉,但是您可以在PuLP中做类似的事情。
要运行此程序,您需要首先安装Pyomo和类似glpk的线性程序求解器。 glpk应该可以解决合理大小的问题,但是如果发现解决问题的时间太长,可以尝试使用CPLEX或Gurobi之类的(更快)商业解决方案。
您可以通过pip install pyomo
或conda install -c conda-forge pyomo
安装Pyomo。您可以从https://www.gnu.org/software/glpk/或通过conda install glpk
安装glpk。 (我认为PuLP带有内置的glpk版本,因此可以节省您一步。)
这是脚本。请注意,这通过为误差的正分量定义一个变量,为负数定义另一个变量来将绝对误差计算为线性表达式。然后,它试图使两者之和最小。在这种情况下,求解器将始终将其设置为零,因为这是减少误差的一种简便方法,然后另一个将等于绝对误差。
import random
import pyomo.environ as po
random.seed(1)
# ~50% sparse data set, big enough to populate every row and column
m = 10 # number of rows
n = 10 # number of cols
data = {
(r, c): random.random()
for r in range(m)
for c in range(n)
if random.random() >= 0.5
}
# define a linear program to find vectors
# x in R^m, y in R^n, such that x[r] + y[c] is close to data[r, c]
# create an optimization model object
model = po.ConcreteModel()
# create indexes for the rows and columns
model.ROWS = po.Set(initialize=range(m))
model.COLS = po.Set(initialize=range(n))
# create indexes for the dataset
model.DATAPOINTS = po.Set(dimen=2, initialize=data.keys())
# data values
model.data = po.Param(model.DATAPOINTS, initialize=data)
# create the x and y vectors
model.X = po.Var(model.ROWS, within=po.NonNegativeReals)
model.Y = po.Var(model.COLS, within=po.NonNegativeReals)
# create dummy variables to represent errors
model.ErrUp = po.Var(model.DATAPOINTS, within=po.NonNegativeReals)
model.ErrDown = po.Var(model.DATAPOINTS, within=po.NonNegativeReals)
# Force the error variables to match the error
def Calculate_Error_rule(model, r, c):
pred = model.X[r] + model.Y[c]
err = model.ErrUp[r, c] - model.ErrDown[r, c]
return (model.data[r, c] + err == pred)
model.Calculate_Error = po.Constraint(
model.DATAPOINTS, rule=Calculate_Error_rule
)
# Minimize the total error
def ClosestMatch_rule(model):
return sum(
model.ErrUp[r, c] + model.ErrDown[r, c]
for (r, c) in model.DATAPOINTS
)
model.ClosestMatch = po.Objective(
rule=ClosestMatch_rule, sense=po.minimize
)
# Solve the model
# get a solver object
opt = po.SolverFactory("glpk")
# solve the model
# turn off "tee" if you want less verbose output
results = opt.solve(model, tee=True)
# show solution status
print(results)
# show verbose description of the model
model.pprint()
# show X and Y values in the solution
for r in model.ROWS:
print('X[{}]: {}'.format(r, po.value(model.X[r])))
for c in model.COLS:
print('Y[{}]: {}'.format(c, po.value(model.Y[c])))
仅是为了完成故事,这是一个更接近您原始示例的解决方案。它使用cvxpy
,但使用我的解决方案中的稀疏数据方法。
我不知道使用cvxpy进行元素计算的“官方”方法,但是似乎可以将标准的Python sum
函数与许多单独的cp.abs(...)
函数一起使用,这似乎是可以的。
这提供的解决方案比线性程序差一点,但是您可以通过调整解决方案公差来解决。
import cvxpy as cp
import random
random.seed(1)
# Problem data.
# ~50% sparse data set
m = 10 # number of rows
n = 10 # number of cols
data = {
(i, j): random.random()
for i in range(m)
for j in range(n)
if random.random() >= 0.5
}
# Construct the problem.
x = cp.Variable(m)
y = cp.Variable(n)
objective = cp.Minimize(
sum(
cp.abs(x[i] + y[j] + data[i, j])
for (i, j) in data.keys()
)
)
prob = cp.Problem(objective)
result = prob.solve()
print(x.value)
print(y.value)
答案 1 :(得分:0)
我没有这个主意,只是基于假设的一些骇人听闻的内容:
(m, 1) + (1, n)
上的广播规则行为那么麻木:
m = 3
n = 4
np.random.seed(1)
a = np.random.randn(m, 1)
b = np.random.randn(1, n)
a
array([[ 1.62434536],
[-0.61175641],
[-0.52817175]])
b
array([[-1.07296862, 0.86540763, -2.3015387 , 1.74481176]])
a + b
array([[ 0.55137674, 2.48975299, -0.67719333, 3.36915713],
[-1.68472504, 0.25365122, -2.91329511, 1.13305535],
[-1.60114037, 0.33723588, -2.82971045, 1.21664001]])
让我们用np.kron来模仿它,它有一个cvxpy-equivalent:
aLifted = np.kron(np.ones((1,n)), a)
bLifted = np.kron(np.ones((m,1)), b)
aLifted
array([[ 1.62434536, 1.62434536, 1.62434536, 1.62434536],
[-0.61175641, -0.61175641, -0.61175641, -0.61175641],
[-0.52817175, -0.52817175, -0.52817175, -0.52817175]])
bLifted
array([[-1.07296862, 0.86540763, -2.3015387 , 1.74481176],
[-1.07296862, 0.86540763, -2.3015387 , 1.74481176],
[-1.07296862, 0.86540763, -2.3015387 , 1.74481176]])
aLifted + bLifted
array([[ 0.55137674, 2.48975299, -0.67719333, 3.36915713],
[-1.68472504, 0.25365122, -2.91329511, 1.13305535],
[-1.60114037, 0.33723588, -2.82971045, 1.21664001]])
让我们半盲地检查cvxpy(我们只是尺寸;懒得设置问题并修复变量以检查输出:-D):
import cvxpy as cp
x = cp.Variable((m, 1))
y = cp.Variable((1, n))
cp.kron(np.ones((1,n)), x) + cp.kron(np.ones((m, 1)), y)
# Expression(AFFINE, UNKNOWN, (3, 4))
# looks good!
现在有一些警告: