从statsmodels检索模型估计

时间:2018-01-30 13:27:49

标签: python statsmodels

来自这样的数据集:

import pandas as pd
import numpy as np
import statsmodels.api as sm

# A dataframe with two variables
np.random.seed(123)
rows = 12
rng = pd.date_range('1/1/2017', periods=rows, freq='D')
df = pd.DataFrame(np.random.randint(100,150,size=(rows, 2)), columns=['y', 'x']) 
df = df.set_index(rng)

enter image description here

......和这样的线性回归模型:

x = sm.add_constant(df['x'])
model = sm.OLS(df['y'], x).fit()

...您可以通过这种方式轻松检索一些模型系数:

print(model.params)

enter image description here

但我无法找到如何从模型摘要中检索所有其他参数:

print(str(model.summary()))

enter image description here

如问题中所述,我对 R-squared 特别感兴趣。

从帖子How to extract a particular value from the OLS-summary in Pandas?我了解到你可以使用print(model.r2)在那里做同样的事情。但这似乎不适用于statsmodels。

有什么建议吗?

2 个答案:

答案 0 :(得分:6)

你可以得到R平方,如:

代码:

model.rsquared

测试代码:

import pandas as pd
import numpy as np
import statsmodels.api as sm

# A dataframe with two variables
np.random.seed(123)
rows = 12
rng = pd.date_range('1/1/2017', periods=rows, freq='D')
df = pd.DataFrame(np.random.randint(100,150,size=(rows, 2)), columns=['y', 'x'])
df = df.set_index(rng)

x = sm.add_constant(df['x'])
model = sm.OLS(df['y'], x).fit()

print(model.params)
print(model.rsquared)
print(str(model.summary()))

结果:

const    176.636417
x         -0.357185
dtype: float64

0.338332793094

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.338
Model:                            OLS   Adj. R-squared:                  0.272
Method:                 Least Squares   F-statistic:                     5.113
Date:                Tue, 30 Jan 2018   Prob (F-statistic):             0.0473
Time:                        05:36:04   Log-Likelihood:                -41.442
No. Observations:                  12   AIC:                             86.88
Df Residuals:                      10   BIC:                             87.85
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        176.6364     20.546      8.597      0.000     130.858     222.415
x             -0.3572      0.158     -2.261      0.047      -0.709      -0.005
==============================================================================
Omnibus:                        1.934   Durbin-Watson:                   1.182
Prob(Omnibus):                  0.380   Jarque-Bera (JB):                1.010
Skew:                          -0.331   Prob(JB):                        0.603
Kurtosis:                       1.742   Cond. No.                     1.10e+03
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.1e+03. This might indicate that there are
strong multicollinearity or other numerical problems.

查找所有属性名称:

只需一小段代码:

for attr in dir(model):
    if not attr.startswith('_'):
        print(attr)

您可以看到对象上的所有属性:

HC0_se
HC1_se
HC2_se
HC3_se
aic
bic
bse
centered_tss
compare_f_test
compare_lm_test
compare_lr_test
condition_number
conf_int
conf_int_el
cov_HC0
cov_HC1
cov_HC2
cov_HC3
cov_kwds
cov_params
cov_type
df_model
df_resid
eigenvals
el_test
ess
f_pvalue
f_test
fittedvalues
fvalue
get_influence
get_prediction
get_robustcov_results
initialize
k_constant
llf
load
model
mse_model
mse_resid
mse_total
nobs
normalized_cov_params
outlier_test
params
predict
pvalues
remove_data
resid
resid_pearson
rsquared
rsquared_adj
save
scale
ssr
summary
summary2
t_test
tvalues
uncentered_tss
use_t
wald_test
wald_test_terms
wresid

答案 1 :(得分:0)

您可以使用服饰,例如

  • model.f_pvalue获取F统计量的p值
  • model.rsquared获取模型等的rsquared值。

请参阅https://www.statsmodels.org/devel/generated/statsmodels.regression.linear_model.RegressionResults.html

中的文档

样本用法: Jupyter screenshot of attributes