Excel Solver in Python

Time: 2011-01-08 14:28:17

Tags: python

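I am trying to implement something like this

http://office.microsoft.com/en-us/excel-help/using-solver-to-rate-sports-teams-HA001124601.aspx

but purely in Python, using Python libraries (without calling Excel Solver).

Can someone point me to the right library to use, plus some dive-in tutorials to get started?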

4 answers:

Answer 0 (score: 21)

You are looking for NumPy (matrix manipulation and number crunching) and SciPy (optimization). To get started, see https://stackoverflow.com/questions/4375094/numpy-learning-resources

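I worked through the given example as follows:

  • I opened the sample Excel file in OpenOffice
  • I copied the team data (without headers) to a new sheet and saved it as teams.csv
  • I copied the game data (without headers) to a new sheet and saved it as games.csv

Then in Python: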

import csv
import numpy
import scipy.optimize

def readCsvFile(fname):
    with open(fname, 'r', newline='') as inf:
        return list(csv.reader(inf))

# Get team data
team = readCsvFile('teams.csv')  # list of num,name
numTeams = len(team)

# Get game data
game = readCsvFile('games.csv')  # list of game,home,away,homescore,awayscore
numGames = len(game)

# Now, we have the NFL teams for 2002 and data on all games played.
# From this, we wish to forecast the score of future games.
# We are going to assume that each team has an inherent performance-factor,
# and that there is a bonus for home-field advantage; then the
# relative final score between a home team and an away team can be
# calculated as (home advantage) + (home team factor) - (away team factor)

# First we create a matrix M which will hold the data on
# who played whom in each game and who had home-field advantage.
m_rows = numTeams + 1
m_cols = numGames
M = numpy.zeros( (m_rows, m_cols) )

# Then we create a vector S which will hold the final
# relative scores for each game.
s_cols = numGames
S = numpy.zeros(s_cols)

# Loading M and S with game data
for col,gamedata in enumerate(game):
    gameNum,home,away,homescore,awayscore = gamedata
    # In the csv data, teams are numbered starting at 1
    # So we let home-team advantage be 'team 0' in our matrix
    M[0, col]         =  1.0   # home team advantage
    M[int(home), col] =  1.0
    M[int(away), col] = -1.0
    S[col]            = int(homescore) - int(awayscore)


# Now, if our theoretical model is correct, we should be able
# to find a performance-factor vector W such that W*M == S
#
# In the real world, we will never find a perfect match,
# so what we are looking for instead is W which results in S'
# such that the least-mean-squares difference between S and S'
# is minimized.

# Initial guess at team weightings:
# 2.0 points home-team advantage, and all teams equally strong
init_W = numpy.array([2.0]+[0.0]*numTeams)  

def errorfn(w,m,s):
    return w.dot(m) - s

W = scipy.optimize.leastsq(errorfn, init_W, args=(M,S))

homeAdvantage = W[0][0]   # 2.2460937500005356
teamStrength = W[0][1:]   # numpy.array([-151.31111318, -136.36319652, ... ])

# Team strengths have meaning only by linear comparison;
# we can add or subtract any constant to all of them without
# changing the meaning.
# To make them easier to understand, we want to shift them
# such that the average is 0.0
teamStrength -= teamStrength.mean()

for t,s in zip(team,teamStrength):
    print("{0:>10}: {1: .7}".format(t[1], s))

Results:

       Ari: -9.8897569
       Atl:  5.0581597
      Balt: -2.1178819
      Buff: -0.27413194
  Carolina: -3.2720486
      Chic: -5.2654514
      Cinn: -10.503646
      Clev:  1.2338542
      Dall: -8.4779514
       Den:  4.8901042
       Det: -9.1727431
        GB:  3.5800347
      Hous: -9.4390625
      Indy:  1.1689236
      Jack: -0.2015625
        KC:  6.1112847
     Miami:  6.0588542
      Minn: -3.0092014
        NE:  4.0262153
        NO:  2.4251736
       NYG:  0.82725694
       NYJ:  3.1689236
       Oak:  10.635243
      Phil:  8.2987847
      Pitt:  2.6994792
 St. Louis: -3.3352431
 San Diego: -0.72065972
        SF:  0.63524306
   Seattle: -1.2512153
     Tampa:  8.8019097
      Tenn:  1.7640625
      Wash: -4.4529514

This matches the results shown in the spreadsheet.
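Because the model is linear in the unknowns, the same fit can also be cross-checked with a direct linear least-squares solve. The following is only a minimal sketch on made-up data, not the NFL data above: the four teams, their strengths, and the helper `predict_margin` are hypothetical, and the design matrix is laid out with one row per game (the transpose of `M` above). It uses `numpy.linalg.lstsq`, which returns the minimum-norm solution when the system is rank-deficient, as it is here because adding a constant to every team strength changes nothing.

```python
import numpy as np

# Hypothetical ground truth, chosen only for illustration.
true_home_adv = 2.5
true_strength = np.array([3.0, -1.0, 0.5, -2.5])  # averages to 0
num_teams = len(true_strength)

# Every team hosts every other team once (0-based team indices).
games = [(h, a) for h in range(num_teams) for a in range(num_teams) if h != a]

# One row per game: column 0 is the home advantage, columns 1.. are teams.
A = np.zeros((len(games), 1 + num_teams))
margins = np.zeros(len(games))
for row, (h, a) in enumerate(games):
    A[row, 0] = 1.0        # home-field advantage applies to every game
    A[row, 1 + h] = 1.0    # home team
    A[row, 1 + a] = -1.0   # away team
    # Noise-free score margin generated from the assumed model
    margins[row] = true_home_adv + true_strength[h] - true_strength[a]

# Solve min ||A w - margins||^2 directly.
w, _, rank, _ = np.linalg.lstsq(A, margins, rcond=None)
home_adv = w[0]
strength = w[1:] - w[1:].mean()  # shift so the average strength is 0

def predict_margin(home, away):
    """Forecast (home score - away score) under the fitted model."""
    return home_adv + strength[home] - strength[away]

print(home_adv, strength, predict_margin(0, 3))
```

With real, noisy scores the recovered values would only approximate the underlying strengths; the noise-free data here just makes it easy to see that the fit and the forecast formula agree with the model.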

Answer 1 (score: 3)

This page lists a number of Python solver libraries you could use:

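Answer 2 (score: 2)

PuLP is a linear programming modeler in Python. It can do everything Excel Solver can do.

PuLP is free, open-source software written in Python. It is used to describe optimization problems as mathematical models. PuLP can then call any of numerous external LP solvers (CBC, GLPK, CPLEX, Gurobi, etc.) to solve the model, and then use Python commands to manipulate and display the solution.

There is a detailed introduction to PuLP and a manual on how to model optimization problems with PuLP in Python.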

Modeling example

# Import PuLP modeler functions
from pulp import *

# Create the 'prob' variable to contain the problem data
prob = LpProblem("Example_Problem", LpMinimize)

# Declare decision variables
var_x = LpVariable(name="x", lowBound=0, cat="Continuous")
var_y = LpVariable(name="y", cat="Integer")

# The objective function is added to 'prob' first
prob += var_x + 2 * var_y

# The constraints are added to 'prob'
prob += var_x == (-1) * var_y
prob += var_x <= 15
prob += var_x >= 0  # LP constraints cannot be strict; use >= rather than >

# The problem is solved using PuLP's choice of Solver
prob.solve()

# The status of the solution is printed to the screen
print("Status:", LpStatus[prob.status])

# Each of the variables is printed with its resolved optimum value
for v in prob.variables():
    print(v.name, "=", v.varValue)
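The model above is small enough to sanity-check by hand. The sketch below does not use PuLP; it simply enumerates the integer values of y, assuming the lower bound on x is read as non-strict (x >= 0), since a linear program cannot express a strict inequality. Because x == -y and 0 <= x <= 15, y can only range over the integers from -15 to 0.

```python
# Brute-force check of: minimize x + 2*y
# subject to x == -y, 0 <= x <= 15, y integer.
best = None
for y in range(-15, 1):        # x = -y must stay within [0, 15]
    x = -y                     # equality constraint fixes x once y is chosen
    objective = x + 2 * y
    if best is None or objective < best[0]:
        best = (objective, x, y)

best_objective, best_x, best_y = best
print("objective =", best_objective, "x =", best_x, "y =", best_y)
```

Under these assumptions the minimum is at x = 15, y = -15 with objective -15, which is what a solver run on the model would be expected to report.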

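Answer 3 (score: 0)

You may want to consider Pyspread, a spreadsheet application written entirely in Python. Individual cells can contain Python expressions and can access all Python modules.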