CPLEX Python API performance overhead?

Time: 2013-03-08 15:38:29

Tags: python performance cplex

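Update

This question has since been discussed thoroughly and updated on OR Exchange, where I cross-posted it.

Original question

When running CPLEX 12.5.0.0 from the command line: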

cplex -f my_instance.lp

the optimal integer solution is found in 19056.99 ticks.

But through the Python API, on the very same instance:

import cplex
problem = cplex.Cplex("my_instance.lp")
problem.solve()

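the required time is now 97407.10 ticks (more than 5 times slower).

In both cases the mode is parallel, deterministic, with up to 2 threads. Wondering whether this poor performance was due to some Python thread overhead, I tried: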

problem = cplex.Cplex("my_instance.lp")
problem.parameters.threads.set(1)
problem.solve()

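This took 46513.04 ticks (i.e., using one core was twice as fast as using two!).

Generally speaking, as far as CPLEX and LP are concerned, I find these results very confusing. Is there a way to improve the Python API performance, or should I switch to a more mature API (i.e., Java or C++)?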

Annex

Here are the full details of the 2-thread runs, starting with the (common) preamble:

Tried aggregator 3 times.
MIP Presolve eliminated 2648 rows and 612 columns.
MIP Presolve modified 62 coefficients.
Aggregator did 13 substitutions.
Reduced MIP has 4229 rows, 1078 columns, and 13150 nonzeros.
Reduced MIP has 1071 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.06 sec. (18.79 ticks)
Probing fixed 24 vars, tightened 0 bounds.
Probing time = 0.08 sec. (18.12 ticks)
Tried aggregator 1 time.
MIP Presolve eliminated 87 rows and 26 columns.
MIP Presolve modified 153 coefficients.
Reduced MIP has 4142 rows, 1052 columns, and 12916 nonzeros.
Reduced MIP has 1045 binaries, 7 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.05 sec. (11.67 ticks)
Probing time = 0.01 sec. (1.06 ticks)
Clique table members: 4199.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 2 threads.
Root relaxation solution time = 0.20 sec. (91.45 ticks)

Results from the command line:

GUB cover cuts applied:  1
Clique cuts applied:  3
Cover cuts applied:  2
Implied bound cuts applied:  38
Zero-half cuts applied:  7
Gomory fractional cuts applied:  2

Root node processing (before b&c):
  Real time             =    5.27 sec. (2345.14 ticks)
Parallel b&c, 2 threads:
  Real time             =   35.15 sec. (16626.69 ticks)
  Sync time (average)   =    0.00 sec.
  Wait time (average)   =    0.00 sec.
                          ------------
Total (root+branch&cut) =   40.41 sec. (18971.82 ticks)

Results from the Python API:

Clique cuts applied:  33
Cover cuts applied:  1
Implied bound cuts applied:  4
Zero-half cuts applied:  10
Gomory fractional cuts applied:  4

Root node processing (before b&c):
  Real time             =    6.42 sec. (2345.36 ticks)
Parallel b&c, 2 threads:
  Real time             =  222.28 sec. (95061.73 ticks)
  Sync time (average)   =    0.01 sec.
  Wait time (average)   =    0.00 sec.
                          ------------
Total (root+branch&cut) =  228.70 sec. (97407.10 ticks)
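
To put the two summaries side by side, the deterministic tick totals can be pulled out of the log text and compared. This is only a minimal sketch: it assumes the final summary line of each log is available as a string, and the helper name `total_ticks` is made up for illustration (it is not part of CPLEX).

```python
import re

# Matches the summary line printed at the end of a MIP solve, e.g.
# "Total (root+branch&cut) =   40.41 sec. (18971.82 ticks)"
TOTAL_RE = re.compile(
    r"Total \(root\+branch&cut\)\s*=\s*([\d.]+) sec\. \(([\d.]+) ticks\)"
)


def total_ticks(log_text):
    """Return (seconds, ticks) from the 'Total' line of a CPLEX log."""
    match = TOTAL_RE.search(log_text)
    if match is None:
        raise ValueError("no 'Total (root+branch&cut)' line found")
    return float(match.group(1)), float(match.group(2))


# Summary lines copied from the two logs above.
cli_log = "Total (root+branch&cut) =   40.41 sec. (18971.82 ticks)"
api_log = "Total (root+branch&cut) =  228.70 sec. (97407.10 ticks)"

cli_sec, cli_ticks = total_ticks(cli_log)
api_sec, api_ticks = total_ticks(api_log)

# Root node processing costs about 2345 ticks in both runs, so the
# whole gap comes from the branch-and-cut phase.
print("tick ratio:       %.2f" % (api_ticks / cli_ticks))
print("wall-clock ratio: %.2f" % (api_sec / cli_sec))
```

Comparing ticks rather than seconds keeps machine load out of the picture, which is why the ratio is computed on both.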

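2 answers:

Answer 0: (score: 1)

You could try disabling presolve and cuts in both cases, then rerun the experiment to test whether the Python API itself is limiting performance. If the performance matches once cuts are disabled, look into the cut parameter tuning and defaults on the Python side.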
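
A rough sketch of that experiment on the Python side. The parameter names below are the ones I believe the CPLEX 12.x Python API exposes for presolve and for the cut families that show up in the logs; treat them as an assumption and check them against your version. The function names are made up for illustration.

```python
def disable_presolve_and_cuts(problem):
    """Switch off presolve and the cut families seen in the logs.

    `problem` is expected to be a cplex.Cplex instance. Parameter names
    are assumed from the CPLEX 12.x Python API; verify for your version.
    """
    params = problem.parameters
    params.preprocessing.presolve.set(0)  # 0 = do not apply presolve
    cuts = params.mip.cuts
    for family in (cuts.cliques, cuts.covers, cuts.gubcovers,
                   cuts.implied, cuts.zerohalfcut, cuts.gomory):
        family.set(-1)  # -1 = do not generate this kind of cut


def solve_without_presolve_and_cuts(lp_path, threads=2):
    """Load an LP file, apply the settings above, and solve."""
    import cplex  # imported here so the helper above has no hard dependency

    problem = cplex.Cplex(lp_path)
    problem.parameters.threads.set(threads)
    disable_presolve_and_cuts(problem)
    problem.solve()
    return problem
```

Calling `solve_without_presolve_and_cuts("my_instance.lp")` and applying the same settings in the command-line run should show whether the two tick counts converge.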

In my view C++ is the preferred choice for performance, but it can add to development time. Just my opinion.

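Answer 1: (score: 0)

Related to this, I have noticed that the Python API takes quite a long time to build the problem after calls to variables.add and linear_constraints.add. It seems that strcmp, called from CPXLgetcolindex, takes up most of the profile; perhaps string IDs are handled by a linear search through an array? In C++, problem creation is instant.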
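
If the name lookup really is the bottleneck (that is only a guess), one thing to try is referring to columns by integer index instead of by name when adding constraints. A minimal sketch, assuming `problem` is a cplex.Cplex instance; the helper `add_vars_and_rows` and its row format are hypothetical, and I have not measured whether this avoids the strcmp cost.

```python
def add_vars_and_rows(problem, n_vars, rows):
    """Add n_vars binary columns, then rows that refer to them by index.

    Each row is (local_indices, coefficients, sense, rhs), where
    local_indices count from 0 within the newly added columns.
    Returns the global index of the first new column.
    """
    # Index the first new column will receive.
    first = problem.variables.get_num()
    # No names are passed, so there are no strings to look up later.
    problem.variables.add(types=["B"] * n_vars)

    # One batched call; each expression is an [indices, values] pair
    # using integer column indices rather than variable names.
    problem.linear_constraints.add(
        lin_expr=[[[first + j for j in ind], list(val)]
                  for ind, val, _, _ in rows],
        senses=[sense for _, _, sense, _ in rows],
        rhs=[rhs for _, _, _, rhs in rows],
    )
    return first
```

For example, `add_vars_and_rows(problem, 2, [([0, 1], [1.0, 1.0], "L", 1.0)])` would add two binaries x0, x1 and the row x0 + x1 <= 1 in a single constraint call.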