Comparing contrasts of linear models in Python (like R's contrast library?)

Time: 2015-12-11 19:21:42

Tags: linear-regression statsmodels

In R, I can do the following to compare two contrasts in a linear model:

url <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/spider_wolff_gorb_2013.csv"
filename <- "spider_wolff_gorb_2013.csv"
install.packages("downloader", repos="http://cran.us.r-project.org")
library(downloader)
if (!file.exists(filename)) download(url, filename)
spider <- read.csv(filename, skip=1)
head(spider, 5)
#   leg type friction
# 1  L1 pull     0.90
# 2  L1 pull     0.91
# 3  L1 pull     0.86
# 4  L1 pull     0.85
# 5  L1 pull     0.80
fit = lm(friction ~ type + leg, data=spider)
fit
# Call:
# lm(formula = friction ~ type + leg, data = spider)
# 
# Coefficients:
# (Intercept)     typepush        legL2        legL3        legL4
#      1.0539      -0.7790       0.1719       0.1605       0.2813
install.packages("contrast", repos="http://cran.us.r-project.org")
library(contrast)
l4vsl2 = contrast(fit, list(leg="L4", type="pull"), list(leg="L2",type="pull"))
l4vsl2
# lm model parameter contrast
# 
#   Contrast       S.E.      Lower     Upper    t  df Pr(>|t|)
#  0.1094167 0.04462392 0.02157158 0.1972618 2.45 277   0.0148

I have found out how to do the above in Python. What remains now is to find the t-statistic for the contrast of leg L4 versus leg L2. Is this possible in Python?
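The Python fit that the question refers to is not shown. A minimal sketch of what it might look like, assuming pandas and the statsmodels formula API; the helper names `load_spider` and `fit_friction_model` are hypothetical, and `fitted1` is simply the name the answer uses for the fitted result:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Same CSV as in the R code.
URL = ("https://raw.githubusercontent.com/genomicsclass/dagdata/"
       "master/inst/extdata/spider_wolff_gorb_2013.csv")


def load_spider(source=URL):
    """Read the spider data; skiprows=1 mirrors skip=1 in read.csv."""
    return pd.read_csv(source, skiprows=1)


def fit_friction_model(df):
    """Fit friction ~ type + leg by ordinary least squares."""
    return smf.ols("friction ~ type + leg", data=df).fit()


# Usage (requires network access), left commented out:
# fitted1 = fit_friction_model(load_spider())
# print(fitted1.params)
```

With treatment coding, the parameter names come out as `leg[T.L2]`, `leg[T.L3]`, `leg[T.L4]`, which is the naming the answer's contrast strings rely on.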

1 answer:

Answer 0: (score: 1)

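statsmodels is still missing some predefined contrasts, but the t_test, wald_test, and f_test methods of the model results classes can be used to test linear (or affine) restrictions. The restrictions are given either as arrays or as strings that use the parameter names.

Details on how to specify contrasts/restrictions should be in the documentation.

For example: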

>>> tt = fitted1.t_test("leg[T.L4] - leg[T.L2]")
>>> print(tt.summary())
                             Test for Constraints                             
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
c0             0.1094      0.045      2.452      0.015       0.022       0.197
==============================================================================
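The same contrast can also be written as an array instead of a string. A sketch, assuming made-up data that only borrows the column names of the spider dataset (the numbers are illustrative, not the real measurements); the contrast vector is built from the parameter names rather than from hard-coded positions:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data with the spider dataset's column names (not the real measurements).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "leg": np.repeat(["L1", "L2", "L3", "L4"], 10),
    "type": np.tile(["pull", "push"], 20),
    "friction": rng.normal(loc=1.0, scale=0.1, size=40),
})
fitted = smf.ols("friction ~ type + leg", data=toy).fit()

# Contrast vector: +1 on leg[T.L4], -1 on leg[T.L2], 0 elsewhere.
names = list(fitted.params.index)
c = np.zeros(len(names))
c[names.index("leg[T.L4]")] = 1.0
c[names.index("leg[T.L2]")] = -1.0

tt_array = fitted.t_test(c)
tt_string = fitted.t_test("leg[T.L4] - leg[T.L2]")

# Both specifications describe the same hypothesis.
same = np.allclose(tt_array.effect, tt_string.effect)
print(same)
```

The string form is easier to read; the array form is convenient when contrasts are generated programmatically.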

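The results are available as attributes or methods of the instance returned by t_test. For example, the confidence interval can be obtained with conf_int:
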
>>> tt.conf_int()
array([[ 0.02157158,  0.19726175]])
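Besides conf_int, the returned object exposes the estimate, its standard error, the t statistic, and the p-value as attributes, which covers what the question asks for. A sketch on made-up data (same column names as the spider dataset, illustrative numbers only) that also recomputes the t statistic by hand as c'b / sqrt(c' V c), where b is the parameter vector and V its estimated covariance:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data with the spider dataset's column names (not the real measurements).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "leg": np.repeat(["L1", "L2", "L3", "L4"], 10),
    "type": np.tile(["pull", "push"], 20),
    "friction": rng.normal(loc=1.0, scale=0.1, size=40),
})
fitted = smf.ols("friction ~ type + leg", data=toy).fit()

tt = fitted.t_test("leg[T.L4] - leg[T.L2]")
effect = float(np.squeeze(tt.effect))    # estimated contrast
se = float(np.squeeze(tt.sd))            # its standard error
t_stat = float(np.squeeze(tt.tvalue))    # t statistic
p_value = float(np.squeeze(tt.pvalue))   # two-sided p-value

# Manual check: t = c'b / sqrt(c' V c).
names = list(fitted.params.index)
c = np.zeros(len(names))
c[names.index("leg[T.L4]")] = 1.0
c[names.index("leg[T.L2]")] = -1.0
b = np.asarray(fitted.params)
V = np.asarray(fitted.cov_params())
t_manual = (c @ b) / np.sqrt(c @ V @ c)

print(t_stat, t_manual)
```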

t_test is vectorized and treats each restriction or contrast as a separate hypothesis. wald_test treats a list of restrictions as a joint hypothesis:

>>> tt = fitted1.t_test(["leg[T.L3] - leg[T.L2], leg[T.L4] - leg[T.L2]"])
>>> print(tt.summary())
                             Test for Constraints                             
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
c0            -0.0114      0.043     -0.265      0.792      -0.096       0.074
c1             0.1094      0.045      2.452      0.015       0.022       0.197
==============================================================================


>>> tt = fitted1.wald_test(["leg[T.L3] - leg[T.L2], leg[T.L4] - leg[T.L2]"])
>>> print(tt.summary())
<F test: F=array([[ 8.10128575]]), p=0.00038081249480917173, df_denom=277, df_num=2>

As an aside: this also works with robust covariance matrices if cov_type is specified as an argument to fit.
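A sketch of the robust-covariance variant, again on made-up data that only borrows the spider dataset's column names. `cov_type="HC3"` is one of the heteroscedasticity-robust options that fit accepts; which option is appropriate depends on the data, so treat the choice here as an assumption:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data with the spider dataset's column names (not the real measurements).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "leg": np.repeat(["L1", "L2", "L3", "L4"], 10),
    "type": np.tile(["pull", "push"], 20),
    "friction": rng.normal(loc=1.0, scale=0.1, size=40),
})
model = smf.ols("friction ~ type + leg", data=toy)

plain = model.fit()                   # default (nonrobust) covariance
robust = model.fit(cov_type="HC3")    # heteroscedasticity-robust covariance

contrast = "leg[T.L4] - leg[T.L2]"
tt_plain = plain.t_test(contrast)
tt_robust = robust.t_test(contrast)

# The point estimate is the same; only the standard error,
# and hence the inference, depends on the covariance estimator.
print(float(np.squeeze(tt_plain.effect)), float(np.squeeze(tt_plain.sd)))
print(float(np.squeeze(tt_robust.effect)), float(np.squeeze(tt_robust.sd)))
```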