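I am trying to implement something like this:
http://office.microsoft.com/en-us/excel-help/using-solver-to-rate-sports-teams-HA001124601.aspx
but purely in Python, using Python libraries (not by calling the Excel Solver).
Can someone point me to the right library to use, plus some dive-in tutorials to get started?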
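Answer 0 (score: 21)
You are looking for NumPy (matrix manipulation and number-crunching) and SciPy (optimization). To get started, see https://stackoverflow.com/questions/4375094/numpy-learning-resources
I worked out the given example as follows:
Then, in Python: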
import csv
import numpy
import scipy.optimize
def readCsvFile(fname):
    with open(fname, 'r') as inf:
        return list(csv.reader(inf))
# Get team data
team = readCsvFile('teams.csv') # list of num,name
numTeams = len(team)
# Get game data
game = readCsvFile('games.csv') # list of game,home,away,homescore,awayscore
numGames = len(game)
# Now, we have the NFL teams for 2002 and data on all games played.
# From this, we wish to forecast the score of future games.
# We are going to assume that each team has an inherent performance-factor,
# and that there is a bonus for home-field advantage; then the
# relative final score between a home team and an away team can be
# calculated as (home advantage) + (home team factor) - (away team factor)
# First we create a matrix M which will hold the data on
# who played whom in each game and who had home-field advantage.
m_rows = numTeams + 1
m_cols = numGames
M = numpy.zeros( (m_rows, m_cols) )
# Then we create a vector S which will hold the final
# relative scores for each game.
s_cols = numGames
S = numpy.zeros(s_cols)
# Loading M and S with game data
for col,gamedata in enumerate(game):
    gameNum,home,away,homescore,awayscore = gamedata
    # In the csv data, teams are numbered starting at 1
    # So we let home-team advantage be 'team 0' in our matrix
    M[0, col] = 1.0 # home team advantage
    M[int(home), col] = 1.0
    M[int(away), col] = -1.0
    S[col] = int(homescore) - int(awayscore)
# Now, if our theoretical model is correct, we should be able
# to find a performance-factor vector W such that W*M == S
#
# In the real world, we will never find a perfect match,
# so what we are looking for instead is W which results in S'
# such that the least-mean-squares difference between S and S'
# is minimized.
# Initial guess at team weightings:
# 2.0 points home-team advantage, and all teams equally strong
init_W = numpy.array([2.0]+[0.0]*numTeams)
def errorfn(w,m,s):
    return w.dot(m) - s
W = scipy.optimize.leastsq(errorfn, init_W, args=(M,S))
homeAdvantage = W[0][0] # 2.2460937500005356
teamStrength = W[0][1:] # numpy.array([-151.31111318, -136.36319652, ... ])
# Team strengths have meaning only by linear comparison;
# we can add or subtract any constant to all of them without
# changing the meaning.
# To make them easier to understand, we want to shift them
# such that the average is 0.0
teamStrength -= teamStrength.mean()
for t,s in zip(team,teamStrength):
    print("{0:>10}: {1: .7}".format(t[1],s))
Results:
Ari: -9.8897569
Atl: 5.0581597
Balt: -2.1178819
Buff: -0.27413194
Carolina: -3.2720486
Chic: -5.2654514
Cinn: -10.503646
Clev: 1.2338542
Dall: -8.4779514
Den: 4.8901042
Det: -9.1727431
GB: 3.5800347
Hous: -9.4390625
Indy: 1.1689236
Jack: -0.2015625
KC: 6.1112847
Miami: 6.0588542
Minn: -3.0092014
NE: 4.0262153
NO: 2.4251736
NYG: 0.82725694
NYJ: 3.1689236
Oak: 10.635243
Phil: 8.2987847
Pitt: 2.6994792
St. Louis: -3.3352431
San Diego: -0.72065972
SF: 0.63524306
Seattle: -1.2512153
Tampa: 8.8019097
Tenn: 1.7640625
Wash: -4.4529514
These match the results shown in the spreadsheet.
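Because the model is linear in the unknowns, the same kind of fit can also be obtained with a direct linear least-squares solve rather than an iterative optimizer. The following is only a minimal sketch and not part of the original answer: the toy `games` list, `num_teams`, and the helper `predict_margin` are made up for illustration, and the margins were constructed to fit the model exactly so that the output is easy to check. `numpy.linalg.lstsq` returns the minimum-norm solution, which copes with the system being rank-deficient (any constant can be added to all team strengths).

```python
import numpy as np

# Hypothetical toy data: (home, away, home_score, away_score),
# with teams numbered from 1 as in the CSV files above.
games = [
    (1, 2, 25, 20),
    (2, 3, 22, 17),
    (3, 1, 17, 21),
    (2, 1, 20, 21),
]
num_teams = 3

# One row per game: column 0 is the home advantage, column i is team i.
A = np.zeros((len(games), num_teams + 1))
margins = np.zeros(len(games))
for row, (home, away, home_score, away_score) in enumerate(games):
    A[row, 0] = 1.0
    A[row, home] = 1.0
    A[row, away] = -1.0
    margins[row] = home_score - away_score

# Linear least squares; rank is 3 here because of the free constant.
w, _, rank, _ = np.linalg.lstsq(A, margins, rcond=None)
home_advantage = w[0]
strength = w[1:] - w[1:].mean()  # shift so the average strength is 0.0

def predict_margin(home, away):
    """Forecast (home score - away score) for a future game."""
    return home_advantage + strength[home - 1] - strength[away - 1]

print(home_advantage, strength, predict_margin(1, 3))
```

On real data the fit will not be exact, but the matrix can be built from the game list in the same way.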
Answer 1 (score: 3)
This page lists many Python solver libraries that you could use:
Answer 2 (score: 2)
PuLP is a linear programming modeler in Python. It can handle the linear and integer programs that the Excel Solver can.
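PuLP is free, open-source software written in Python. It is used to describe optimization problems as mathematical models. PuLP can then call any of numerous external LP solvers (CBC, GLPK, CPLEX, Gurobi, etc.) to solve this model, and then use Python commands to manipulate and display the solution.
There is a detailed introduction about PuLP and a manual on how to model optimization problems with PuLP in Python.
Modeling example: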
# Import PuLP modeler functions
from pulp import *
# Create the 'prob' variable to contain the problem data
prob = LpProblem("Example_Problem", LpMinimize)
# Declare decision variables
var_x = LpVariable(name="x", lowBound=0, cat="Continuous")
var_y = LpVariable(name="y", cat="Integer")
# The objective function is added to 'prob' first
prob += var_x + 2 * var_y
# The constraints are added to 'prob'
prob += var_x == (-1) * var_y
prob += var_x <= 15
prob += var_x >= 0  # LP constraints cannot be strict inequalities
# The problem is solved using PuLP's choice of Solver
prob.solve()
# The status of the solution is printed to the screen
print("Status:", LpStatus[prob.status])
# Each of the variables is printed with its resolved optimum value
for v in prob.variables():
    print(v.name, "=", v.varValue)
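PuLP only models linear objectives, so the squared-error fit from the team-rating spreadsheet cannot be written in it directly. Assuming one is willing to minimise the sum of absolute errors instead (a different criterion, so ratings on real data will generally differ from the spreadsheet's), a rough sketch could look like this. The toy `games` data and all variable names are hypothetical, and the code assumes the CBC solver bundled with PuLP is available.

```python
import pulp

# Hypothetical toy data: (home team, away team, home score - away score)
games = [("A", "B", 5), ("B", "C", 5), ("C", "A", -4), ("B", "A", -1)]
teams = ["A", "B", "C"]

prob = pulp.LpProblem("Team_Ratings_LAD", pulp.LpMinimize)

# Free (unbounded) variables for the home advantage and each team rating
home_adv = pulp.LpVariable("home_advantage")
rating = pulp.LpVariable.dicts("rating", teams)
# One non-negative variable per game that bounds the absolute error
err = pulp.LpVariable.dicts("abs_error", range(len(games)), lowBound=0)

# Objective: total absolute error over all games
prob += pulp.lpSum([err[g] for g in range(len(games))])

for g, (home, away, margin) in enumerate(games):
    predicted = home_adv + rating[home] - rating[away]
    # err[g] >= |predicted - margin|, written as two linear constraints
    prob += predicted - margin <= err[g]
    prob += margin - predicted <= err[g]

# Ratings only matter relative to each other, so pin their sum to zero
prob += pulp.lpSum([rating[t] for t in teams]) == 0

prob.solve(pulp.PULP_CBC_CMD(msg=False))
status = pulp.LpStatus[prob.status]
fitted = {t: rating[t].varValue for t in teams}
print(status, home_adv.varValue, fitted)
```

The toy margins were built to fit the model exactly, so the solver should report a total error of zero here.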
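Answer 3 (score: 0)
You may want to consider Pyspread, a spreadsheet application written entirely in Python. Individual cells can hold Python expressions and can access all Python modules.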