Generating HTML from a statsmodels summary

Time: 2016-07-08 07:09:13

Tags: multiple-regression statsmodels

I have the following code, which fits a regression model and prints its summary to a log file:

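import statsmodels.formula.api as smf

# Fit the multiple regression model (this runs inside a method, hence self)
fit = smf.ols(self.formula_string, data=df_train).fit()

# Convert the parameters and the full summary to strings for logging
fit_parameters = str(fit.params)
fit_summary = str(fit.summary())
logger.info('fit_summary:\n%s', fit_summary)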

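As we know, the summary consists of a table followed by a grid. Can the grid portion of the summary (shown in blue in the example image below) be converted to an HTML file?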

[Image: example summary output, with the grid portion highlighted in blue]

1 answer:

Answer 0 (score: 2)

The OLS summary is built from 3 separate tables. Each table can be converted individually to string/text, HTML, or LaTeX.
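As a minimal sketch of that idea, the helper below renders every table of a summary in one chosen format. It is not part of statsmodels: render_tables is a hypothetical name, and it only assumes that summary.tables is a list whose items offer as_text(), as_html() and as_latex_tabular(), as the SimpleTable objects in a statsmodels summary do.

```python
def render_tables(summary, fmt="html"):
    """Return one rendered string per table in summary.tables.

    Hypothetical helper: assumes each table exposes as_text(), as_html()
    and as_latex_tabular(), like statsmodels' SimpleTable.
    """
    # Map the requested format to the table method that produces it
    method_names = {
        "text": "as_text",
        "html": "as_html",
        "latex": "as_latex_tabular",
    }
    if fmt not in method_names:
        raise ValueError("fmt must be one of: " + ", ".join(sorted(method_names)))
    # Call the chosen method on every table, preserving table order
    return [getattr(table, method_names[fmt])() for table in summary.tables]
```

Given a fitted results object, render_tables(res.summary(), "html")[1] should then be the coefficient table as an HTML string.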

In the following, res is the OLS results instance returned by the fit method:

>>> summ = res.summary()
>>> dir(summ)
    ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', 
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', 
'__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', 
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', 
'__subclasshook__', '__weakref__', '_repr_html_', 'add_extra_txt', 
'add_table_2cols', 'add_table_params', 'as_csv', 'as_html', 'as_latex', 
'as_text', 'extra_txt', 'tables']

>>> len(summ.tables)
3
>>> summ.tables[1].as_html()
'<table class="simpletable">\n<tr>\n        <td></td>          <th>coef</th>     <th>std err</th>      <th>t</th>      <th>P>|t|</th>  <th>[0.025</th>    <th>0.975]</th>  \n</tr>\n<tr>\n  <th>C(Region)[C]</th> <td>   38.6517</td> <td>    9.456</td> <td>    4.087</td> <td> 0.000</td> <td>   19.826</td> <td>   57.478</td>\n</tr>\n<tr>\n  <th>C(Region)[E]</th> <td>   23.2239</td> <td>   14.931</td> <td>    1.555</td> <td> 0.124</td> <td>   -6.501</td> <td>   52.949</td>\n</tr>\n<tr>\n  <th>C(Region)[N]</th> <td>   28.6347</td> <td>   13.127</td> <td>    2.181</td> <td> 0.032</td> <td>    2.501</td> <td>   54.769</td>\n</tr>\n<tr>\n  <th>C(Region)[S]</th> <td>   34.1034</td> <td>   10.370</td> <td>    3.289</td> <td> 0.002</td> <td>   13.459</td> <td>   54.748</td>\n</tr>\n<tr>\n  <th>C(Region)[W]</th> <td>   28.5604</td> <td>   10.018</td> <td>    2.851</td> <td> 0.006</td> <td>    8.616</td> <td>   48.505</td>\n</tr>\n<tr>\n  <th>Literacy</th>     <td>   -0.1858</td> <td>    0.210</td> <td>   -0.886</td> <td> 0.378</td> <td>   -0.603</td> <td>    0.232</td>\n</tr>\n<tr>\n  <th>Wealth</th>       <td>    0.4515</td> <td>    0.103</td> <td>    4.390</td> <td> 0.000</td> <td>    0.247</td> <td>    0.656</td>\n</tr>\n</table>'

>>> print(summ.tables[1])
================================================================================
                   coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
C(Region)[C]    38.6517      9.456      4.087      0.000      19.826      57.478
C(Region)[E]    23.2239     14.931      1.555      0.124      -6.501      52.949
C(Region)[N]    28.6347     13.127      2.181      0.032       2.501      54.769
C(Region)[S]    34.1034     10.370      3.289      0.002      13.459      54.748
C(Region)[W]    28.5604     10.018      2.851      0.006       8.616      48.505
Literacy        -0.1858      0.210     -0.886      0.378      -0.603       0.232
Wealth           0.4515      0.103      4.390      0.000       0.247       0.656
================================================================================
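To get that table into an HTML file, as the question asks, one possible sketch is to wrap the fragment returned by as_html() in a minimal page and write it to disk. The function name write_html_page, the default page title, and the output path are illustrative assumptions rather than anything provided by statsmodels; only the standard library is used.

```python
import html
from pathlib import Path


def write_html_page(table_html, path, title="Regression coefficients"):
    """Wrap an HTML table fragment in a minimal page and write it to path.

    table_html is expected to be a string such as the one returned by
    summ.tables[1].as_html(); it is inserted as-is, without escaping.
    """
    page = (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        '<meta charset="utf-8">\n'
        "<title>" + html.escape(title) + "</title>\n"
        "</head>\n"
        "<body>\n" + table_html + "\n</body>\n"
        "</html>\n"
    )
    out = Path(path)
    out.write_text(page, encoding="utf-8")
    return out
```

With the summ object from the session above, a call such as write_html_page(summ.tables[1].as_html(), "coef_table.html") would write only the coefficient grid; the file name here is just an example.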