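How do I minimize forecast error with fmin?

Time: 2016-06-29 22:52:02

Tags: python python-2.7 scipy

I'm trying to minimize forecast error by choosing the right "fall out rate" (r). I'm still new to Pandas and brand new to SciPy. Please help!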

import pandas as pd
from scipy.optimize import fmin

data = pd.DataFrame({'Division': [1,2,3]*3,
                     'Month': ['May','May','May','June','June','Jun','Jul','Jul','Jul'],
                     'Definite_Units':[8]*9,
                     'Maybe_Units':[3,2,1]*3,
                     'Actually_Shipped_Units':[9]*9})

p = lambda r,x,y: x+y*r
e = lambda r,x,y,z: abs(1-(p(x,y,r)/z))

x = div_data['Definite_Units'].sum
y = div_data['Maybe_Units'].sum
z = div_data['Actually_Shipped_Units'].sum

for d in range(1,4):
    r0 = 1
    div_data = data['Division']=d
    x = div_data['Definite_Units'].sum()
    y = div_data['Maybe_Units'].sum()
    z = div_data['Actually_Shipped_Units'].sum()
    t = fmin(e,r0,args=(x,y,z))
    print d, t

I want one r per division that minimizes e.

So in this case my output should be:

  • Division 1: r = 0.33, e = 0
  • Division 2: r = 0.50, e = 0
  • Division 3: r = 1.00, e = 0

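1 Answer:

Answer 0: (score: 0)

So I learned a few things about fmin on this project:

- Arguments must be in array format, so I wrote the return_array helper function.

- The variable being optimized must be listed first in the function to be minimized. So for me it had to be e(r,c,u,s), not e(c,u,s,r).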
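
To illustrate the second point, here is a minimal sketch (not the code from this project) of how fmin calls the function it minimizes. The trial value of the optimized variable arrives as the first positional argument, as a one-element NumPy array, and the entries of args follow in the order given. The name objective and the hard-coded numbers are assumptions for illustration, taken from division 1 of the sample data.

```python
import numpy as np
from scipy.optimize import fmin

def objective(r, c, u, s):
    # fmin passes the current guess first, as a 1-D array of length 1;
    # c, u and s are the extra values supplied through args, in order.
    forecast = c + u * r[0]
    return np.mean(np.abs(1.0 - forecast / s))

# Illustrative values for one division (assumed: C=8, U=3, S=9 on every row).
c = np.array([8.0, 8.0, 8.0])
u = np.array([3.0, 3.0, 3.0])
s = np.array([9.0, 9.0, 9.0])

# x0 is the starting guess for r; disp=False silences the convergence report.
best_r = fmin(objective, x0=[1.0], args=(c, u, s), disp=False)
print(best_r)
```

If the signature were objective(c, u, s, r) instead, fmin would still hand its guess to the first parameter, so c would receive the trial value of r and the result would be meaningless.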

#calculate new fall out rates with fmin
import numpy as np
import pandas as pd
from scipy.optimize import fmin

data = pd.DataFrame({'DIV': [1,2,3]*3,
                     'MONTH': ['May','May','May','June','June','Jun','Jul','Jul','Jul'],
                     'C':[8]*9,
                     'U':[3,2,1]*3,
                     'S':[9]*9})

data.to_csv(r'C:\Users\mbabski\Documents\Unit Plan Summer 2016\data_test.csv')

def return_array(x):
    return x.values

def mape(c,u,s,r): #returns an array of line level Mean Absolute Percentage Errors
    p = c + u * r #calculates the forecasted number of units
    m = abs(1.0-(p/s)) #calculates the MAPE at the line level
    return m

def e(r,c,u,s): #calculates average of the MAPEs
    return np.mean(mape(c,u,s,r)) 

for d in range(1,4):
    div_data = data[data.DIV==d]
    c = return_array(div_data.C)
    u = return_array(div_data.U)
    s = return_array(div_data.S)
    r0 = [[1.0]]
    t = fmin(e,r0,args=(c,u,s))
    print 'r:',t
  

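Optimization terminated successfully.
         Current function value: 0.000011
         Iterations: 16
         Function evaluations: 32
r: [0.33330078]
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 15
         Function evaluations: 30
r: [0.5]
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 10
         Function evaluations: 20
r: [1.]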
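
As a sanity check on these values, the error is zero exactly when C + U * r = S, so for this sample data r can also be worked out by hand as (S - C) / U. The short sketch below does that arithmetic. The divisions dictionary is a hypothetical stand-in for the DataFrame, and the shortcut only applies when a single r can drive the error to zero, which happens here because every row within a division is identical.

```python
# Per-division (C, U, S) values from the sample data; every row in a
# division repeats the same numbers, so one tuple per division is enough.
divisions = {1: (8, 3, 9), 2: (8, 2, 9), 3: (8, 1, 9)}

# Solve C + U * r = S for r in each division.
closed_form_r = {d: (s - c) / float(u) for d, (c, u, s) in divisions.items()}

for d in sorted(closed_form_r):
    print('division %d: r = %.4f' % (d, closed_form_r[d]))
```

These agree with the fmin results above to within the optimizer's default tolerance.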