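How to set up a loop to run regressions on a DataFrame in Python

Time: 2016-10-22 20:41:29

Tags: loops regression

I am currently running regressions on different dependent variables from a single DataFrame (called df). I would like to know how to write a loop, because I run about 48 of these regressions. The non-loop version looks like this: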

agric_ff = ols(formula = 'agric ~ prem + smb + hml', data=df).fit()
agric_ff_df = pd.DataFrame({'params': agric_ff.params})    
agric_ff_df.columns = ['agric']

food_ff = ols(formula = 'food ~ prem + smb + hml', data=df).fit()
food_ff_df = pd.DataFrame({'params': food_ff.params})    
food_ff_df.columns = ['food']

soda_ff = ols(formula = 'soda ~ prem + smb + hml', data=df).fit()
soda_ff_df = pd.DataFrame({'params': soda_ff.params})    
soda_ff_df.columns = ['soda']

beer_ff = ols(formula = 'beer ~ prem + smb + hml', data=df).fit()
beer_ff_df = pd.DataFrame({'beer': beer_ff.params})    
beer_ff_df.columns = ['beer']

smoke_ff = ols(formula = 'smoke ~ prem + smb + hml', data=df).fit()
smoke_ff_df = pd.DataFrame({'smoke': smoke_ff.params})    
smoke_ff_df.columns = ['smoke']

toys_ff = ols(formula = 'toys ~ prem + smb + hml', data=df).fit()
toys_ff_df = pd.DataFrame({'toys': toys_ff.params})    
toys_ff_df.columns = ['toys']

fun_ff = ols(formula = 'fun ~ prem + smb + hml', data=df).fit()
fun_ff_df = pd.DataFrame({'fun': fun_ff.params})    
fun_ff_df.columns = ['fun']

books_ff = ols(formula = 'books ~ prem + smb + hml', data=df).fit()
books_ff_df = pd.DataFrame({'books': books_ff.params})
books_ff_df.columns = ['books']

Thank you very much for your help.

1 answer:

Answer 0: (score: 0)

Here is one way to do it using formula notation:

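import statsmodels.regression.linear_model as sm
import pandas as pd
from sklearn import datasets  # load a dummy dataset

# assumed missing step: load the dataset before wrapping it in a DataFrame
# (load_boston() was removed in scikit-learn 1.2, so this needs an older version)
boston = datasets.load_boston()

# build a model using 4 columns, regressed on 4 others
boston = pd.DataFrame(boston.data, columns = boston.feature_names)
boston.head()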


      CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  PTRATIO       B  LSTAT
0  0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0     15.3  396.90   4.98
1  0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0     17.8  396.90   9.14
2  0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0     17.8  392.83   4.03
3  0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0     18.7  394.63   2.94
4  0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0     18.7  396.90   5.33

list_of_responses = ["LSTAT","RM","RAD","B"]

# list of models
models = []

for resp in list_of_responses:
    formula = resp + " ~ CRIM + ZN + INDUS + NOX"
    models.append(sm.OLS.from_formula(formula, data = boston).fit())

# each element is your model. For example, you can access its params
models[0].params
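
The same loop can be applied to the column names from the question, collecting the coefficients into one table instead of one DataFrame per regression. This is only a sketch: the df built here is synthetic stand-in data with made-up values, the names `responses`, `fits`, and `coefs` are illustrative, and only four of the roughly 48 response columns are shown.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Synthetic stand-in for the asker's df (hypothetical data, fixed seed).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame(rng.normal(size=(n, 3)), columns=["prem", "smb", "hml"])
responses = ["agric", "food", "soda", "beer"]
for i, name in enumerate(responses):
    # Each response loads on prem with a different slope (1, 2, 3, 4).
    df[name] = 0.5 + (i + 1) * df["prem"] + rng.normal(scale=0.1, size=n)

# Fit one model per response and keep the fitted results keyed by name.
fits = {name: ols(f"{name} ~ prem + smb + hml", data=df).fit()
        for name in responses}

# One column of coefficients per response; rows are Intercept, prem, smb, hml.
coefs = pd.DataFrame({name: fit.params for name, fit in fits.items()})
print(coefs.round(2))
```

Keeping the fitted results in a dict keyed by response name means a model can be looked up as `fits["food"]` rather than by list position. The formula string requires each column name to be a valid Python identifier; for names that are not, patsy's `Q("column name")` quoting may help.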