Generalized Linear Models in Matlab (same results as in R)

Question

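In R, after fitting a glm, the summary reports both the residual deviance and the null deviance, which tells you how much better your model is than a model with only an intercept term. For example, for the model: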

model <- glm(formula = am ~ mpg + qsec, data=mtcars, family=binomial)

we get:

> summary(model)
...
    Null deviance: 43.2297  on 31  degrees of freedom
Residual deviance:  7.5043  on 29  degrees of freedom
AIC: 13.504
...

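In Matlab, fitglm returns an object of the GeneralizedLinearModel class, which has a Deviance property containing the residual deviance. However, I can't find anything directly related to the null deviance. What is the easiest way to compute it?

Example Matlab code: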

load fisheriris.mat
model = fitglm(meas(:, 1), ismember(species, {'setosa'}), 'Distribution', 'binomial')

which yields

model = 


Generalized Linear regression model:
    logit(y) ~ 1 + x1
    Distribution = Binomial

Estimated Coefficients:
                       Estimate                SE                  tStat                 pValue       
                   _________________    _________________    _________________    ____________________

    (Intercept)     27.8285213954246      4.8275686220899     5.76450042948896    8.19000695766331e-09
    x1             -5.17569812610148    0.893399843474784    -5.79326061438645    6.90328570107794e-09


150 observations, 148 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 119, p-value = 9.87e-28

The residual deviance is model.Deviance:

>> model.Deviance

ans =

          71.8363992272217

Answer 1

I wrote a GLM class for Matlab that gives exactly the same results as R:

Generalized Linear Models in Matlab (same results as in R)

For example, a gamma-distributed GLM with a log link, fitted to sample data, gives this in R:

Call:
glm(formula = MilesPerGallon ~ Horsepower + Acceleration + Cylinders, 
    family = Gamma(link = log), data = data)

Deviance Residuals: 
      Min         1Q     Median         3Q        Max  
-0.116817  -0.075084   0.004179   0.060545   0.197108  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)   4.955205   0.509903   9.718  < 2e-16 ***
Horsepower   -0.017605   0.004352  -4.046 5.21e-05 ***
Acceleration -0.026137   0.015540  -1.682   0.0926 .  
Cylinders     0.093277   0.054458   1.713   0.0867 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.0133)

    Null deviance: 0.388832  on 10  degrees of freedom
Residual deviance: 0.093288  on  7  degrees of freedom
AIC: 64.05

Number of Fisher Scoring iterations: 4

Pearson MSE:  0.008783281 
Deviance MSE:  0.008480725 
McFadden R^2:  0.7600815

With the package, the same estimation in Matlab gives the following result:

 :: convergence in 4 iterations
 ------------------------------------------------------------------------------------------
    dependent: MilesPerGallon
  independent: (Intercept),Horsepower,Acceleration,Cylinders
 ------------------------------------------------------------------------------------------
  log(E[MilesPerGallon]) = ß1×(Intercept) + ß2×Horsepower + ß3×Acceleration + ß4×Cylinders
 ------------------------------------------------------------------------------------------
 distribution: GAMMA
         link: LOG
       weight: -
       offset: -
 ============================================================
     Variable    Estimate     S.E.    z-value    Pr(>|z|)
 ============================================================
   (Intercept)      4.955     0.510    9.708     0.00000
    Horsepower     -0.018     0.004   -4.042     0.00005
  Acceleration     -0.026     0.016   -1.680     0.09290
     Cylinders      0.093     0.055    1.711     0.08706
 ============================================================
  Residual deviance:     0.0933     Deviance MSE: 0.0085
  Null deviance:         0.3888     Pearson MSE:  0.0088
  Dispersion:            0.0133     Deviance IC:  0.1026
  McFadden R^2:          0.7601     Residual df:  7.0000
 ============================================================

Roughly the same output. Hope this helps someone.
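As a quick sanity check on the two printouts above, the reported McFadden R^2 and Deviance MSE appear to follow directly from the two deviances: 1 - residual/null and residual/n respectively. This is inferred from the printed numbers, not from the package source, and n = 11 is assumed from the 10 null degrees of freedom plus the intercept. A minimal Matlab sketch:

```matlab
% Values copied from the printed output above.
residual_deviance = 0.093288;
null_deviance     = 0.388832;
n_obs             = 10 + 1;   % assumption: null df (10) + 1 for the intercept

% Candidate formulas, inferred from the printed numbers (not from the package source).
deviance_ratio = 1 - residual_deviance / null_deviance;   % compare with McFadden R^2
deviance_mse   = residual_deviance / n_obs;               % compare with Deviance MSE

fprintf('1 - D_res/D_null = %.4f\n', deviance_ratio);
fprintf('D_res / n        = %.4f\n', deviance_mse);
```

Both values round to the figures shown in the R and Matlab outputs (0.7601 and 0.0085).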

Answer 2

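If fitglm is called with a table and a regression specified in Wilkinson notation, the resulting GeneralizedLinearModel object model has properties that let us retrieve the table used to fit the model, the response name, and the distribution.

Since R's null deviance is just the deviance of a model fitted with only an intercept, we can find it by fitting null_deviance_model using that information: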

% Refit with an intercept only; fitglm is the documented replacement for the older .fit method
null_deviance_model = fitglm(model.Variables, ...
      [model.ResponseName, ' ~ 1'], 'Distribution', model.Distribution.Name);

The equivalent of R's null deviance is then given by null_deviance_model.Deviance.
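To make the approach above concrete, here is a minimal sketch on the fisheriris data from the question, refit from a table so that Wilkinson notation applies. The table and its variable names (SepalLength, IsSetosa) are made up for this example, and it uses fitglm for the intercept-only refit. For a 0/1 response the intercept-only fit predicts the sample mean for every observation, so the null deviance can also be cross-checked in closed form:

```matlab
load fisheriris.mat

% Put the data in a table so the model can be written in Wilkinson notation.
% The variable names SepalLength and IsSetosa are chosen for this sketch only.
tbl = table(meas(:, 1), double(ismember(species, {'setosa'})), ...
    'VariableNames', {'SepalLength', 'IsSetosa'});

model = fitglm(tbl, 'IsSetosa ~ SepalLength', 'Distribution', 'binomial');

% Intercept-only refit on the same table, response, and distribution.
null_deviance_model = fitglm(model.Variables, ...
    [model.ResponseName, ' ~ 1'], 'Distribution', model.Distribution.Name);

residual_deviance = model.Deviance;
null_deviance     = null_deviance_model.Deviance;

% Closed-form check for 0/1 responses: the intercept-only fit predicts the
% sample mean p for every observation.
y = tbl.IsSetosa;
p = mean(y);
null_deviance_check = -2 * sum(y .* log(p) + (1 - y) .* log(1 - p));

fprintf('residual deviance: %.4f\n', residual_deviance);
fprintf('null deviance:     %.4f (closed form %.4f)\n', ...
    null_deviance, null_deviance_check);
```

With 50 setosa rows out of 150, the closed form works out to about 190.95. That also agrees with adding the printed "Chi^2-statistic vs. constant model" (119) to the residual deviance (71.84) in the question's output.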

I'm not sure whether this extends to regressions that use matrices and vectors for the covariates and response.
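For the matrix/vector case, two options that should work without building a table are sketched below, on the same fisheriris data as the question. The first refits with the 'constant' model specification, which fitglm documents as an intercept-only model. The second uses devianceTest, which compares the fitted model against a constant model; reading the constant model's deviance from the first row of its output is an assumption worth verifying on your Matlab release.

```matlab
load fisheriris.mat
X = meas(:, 1);
y = double(ismember(species, {'setosa'}));

model = fitglm(X, y, 'Distribution', 'binomial');

% Option 1: refit with the 'constant' model specification (intercept only).
null_model    = fitglm(X, y, 'constant', 'Distribution', 'binomial');
null_deviance = null_model.Deviance;

% Option 2: devianceTest compares the fitted model against a constant model.
% Assumption: the first row of the returned table is the constant model.
dev_table               = devianceTest(model);
null_deviance_from_test = dev_table.Deviance(1);

fprintf('null deviance (refit):        %.4f\n', null_deviance);
fprintf('null deviance (devianceTest): %.4f\n', null_deviance_from_test);
```

If the model was fitted with a non-default link, weights, or an offset, the same options would need to be passed to the intercept-only refit for the two deviances to be comparable.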

How do you get the equivalents of R's null and residual deviance from Matlab's fitglm?
