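I want to define a function that runs an OLS model between each column of a data frame and its last column. For example, my data frame has 13 columns, so I have to run 12 OLS regressions, which is far too much to write out by hand.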
import pandas as pd
from sklearn import linear_model
DF = pd.read_excel('data.xlsx')
print(DF)
# Regression Model
for columns in DF:
reg = linear_model.LinearRegression()
reg.fit(DF[['INCOME']], DF.x)
reg1 = linear_model.LinearRegression()
reg1.fit(DF[['INCOME']], DF.FOOD)
reg2 = linear_model.LinearRegression()
reg2.fit(DF[['INCOME']], DF.SMOKING)
.
.
.
reg11 = linear_model.LinearRegression()
reg11.fit(DF[['INCOME']], DF.HOTEL)
reg12 = linear_model.LinearRegression()
reg12.fit(DF[['INCOME']], DF.OTHERS)
# Beta coefficients
B1 = reg1.coef_
B2 = reg2.coef_
.
B10 = reg10.coef_
B11 = reg11.coef_
B12 = reg12.coef_
print(B1)
print(B2)
.
print(B10)
print(B11)
print(B12)
I just want to make this shorter.
Answer 0 (score: 0)
You can loop over the columns and store the results in a dictionary, for example:
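import pandas as pd
from sklearn import linear_model

df = pd.read_excel('data.xlsx')
coefs = {}  # named coefs so the built-in dict is not shadowed
for col in df.columns:
    if col == 'INCOME':
        continue  # skip regressing the predictor on itself
    reg = linear_model.LinearRegression()
    reg.fit(df[['INCOME']], df[col])  # predictor must be 2-D, target 1-D
    coefs[col] = reg.coef_
    print(col, coefs[col])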
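Since the question asks for a function, the same loop can be wrapped in one. This is only a minimal sketch: the name `fit_each_column`, the `slope`/`intercept` column labels, and the small synthetic data frame are assumptions made for illustration and do not come from the original post. It assumes the predictor is the last column, as described in the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def fit_each_column(df, predictor):
    """Regress every other column of df on the single column `predictor`.

    Returns a DataFrame indexed by target column, holding the fitted
    slope and intercept of each simple regression.
    """
    X = df[[predictor]]  # double brackets keep X 2-D, as scikit-learn expects
    rows = {}
    for target in df.columns.drop(predictor):
        reg = LinearRegression().fit(X, df[target])
        rows[target] = {"slope": reg.coef_[0], "intercept": reg.intercept_}
    return pd.DataFrame.from_dict(rows, orient="index")


# Synthetic stand-in for data.xlsx; the values are made up for illustration.
income = np.arange(1.0, 11.0)
demo = pd.DataFrame({
    "FOOD": 0.5 * income + 2.0,
    "HOTEL": 0.1 * income - 1.0,
    "INCOME": income,
})

# Use the last column as the predictor, as in the question.
result = fit_each_column(demo, predictor=demo.columns[-1])
print(result)
```

With the real data you would call `fit_each_column(df, df.columns[-1])` after `pd.read_excel('data.xlsx')`. Returning one table instead of printing inside the loop keeps all twelve coefficients together for later use.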