Where does the data go in scipy.optimize.least_squares?

Asked: 2018-10-01 16:57:48

Tags: python dataframe scipy linear-regression

I am trying to run a simple multiple regression of the form

Y = b_1 * X_1 + b_2 * X_2 + b_3 * X_3 + e

subject to the constraints:

sum(beta) = 1
beta >= 0

My input data looks like this:

df = pd.DataFrame(np.random.randint(low=0, high=10, size=(100,4)), 
          columns=['Historic Rate', 'Overnight', '1M','3M'])

Y = df['Historic Rate']
X = df[['Overnight', '1M', '3M']]

So I was hoping to use the scipy.optimize.least_squares function along these lines:

scipy.optimize.least_squares(fun, bounds=(0,1),X)

where X = my independent variable data, and the function is defined as

Y - B1*X1 - B2*X2 - B3*X3

but I am not sure where the input data goes so that this OLS can be estimated.

1 Answer:

Answer 0 (score: 1)

What is beta in your question? Assuming beta is supposed to be a vector containing b1, ..., b3, this is simply a constrained optimization problem that can be solved easily with scipy's minimize, as follows:

import pandas as pd
import numpy as np
from scipy.optimize import minimize

# Your Data
df = pd.DataFrame(np.random.randint(low=0, high=10, size=(100,4)), columns=['Historic Rate', 'Overnight', '1M','3M'])
Y = np.array(df['Historic Rate'])
X = np.array(df[['Overnight','1M','3M']])

# Define the Model
model = lambda b, X: b[0] * X[:,0] + b[1] * X[:,1] + b[2] * X[:,2]

# The objective Function to minimize (least-squares regression)
obj = lambda b, Y, X: np.sum(np.abs(Y-model(b, X))**2)

# Bounds: b[0], b[1], b[2] >= 0
bnds = [(0, None), (0, None), (0, None)]

# Constraint: b[0] + b[1] + b[2] - 1 = 0
cons = [{"type": "eq", "fun": lambda b: b[0]+b[1]+b[2] - 1}]

# Initial guess for b[0], b[1], b[2] (chosen to satisfy the constraints):
xinit = np.array([0, 0, 1])

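# With an equality constraint supplied and no method given, minimize picks the SLSQP solver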
res = minimize(obj, args=(Y, X), x0=xinit, bounds=bnds, constraints=cons)
print(f"b1={res.x[0]}, b2={res.x[1]}, b3={res.x[2]}")
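To answer the literal question about scipy.optimize.least_squares: the data is forwarded to your residual function through the args parameter, and x0 is the initial guess for the coefficients. Below is a minimal sketch reusing model, Y and X from above; note that least_squares only supports box bounds, so the sum(beta) = 1 equality constraint cannot be imposed there, which is why minimize is the better fit for this problem.

from scipy.optimize import least_squares

# Residual vector; least_squares minimizes the sum of its squares
residuals = lambda b, Y, X: Y - model(b, X)

# Extra data (Y, X) is passed to the residual function via args;
# bounds=(0, 1) keeps each coefficient in [0, 1], but the equality
# constraint b[0] + b[1] + b[2] == 1 cannot be expressed here.
res_ls = least_squares(residuals, x0=np.array([0.3, 0.3, 0.4]),
                       bounds=(0, 1), args=(Y, X))
print(res_ls.x)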