来自rms包的fastbw回归函数的P值

时间:2015-06-06 07:43:25

标签: r regression linear-regression rms

我正在尝试使用rms包的fastbw函数进行反向回归,如下所示(使用mtcars数据集):

> mod = ols(mpg~am+vs+cyl+drat+wt+gear, mtcars) 
> mod

Linear Regression Model

ols(formula = mpg ~ am + vs + cyl + drat + wt + gear, data = mtcars)

                Model Likelihood     Discrimination    
                   Ratio Test           Indexes        
Obs       32    LR chi2     58.26    R2       0.838    
sigma 2.7008    d.f.            6    R2 adj   0.799    
d.f.      25    Pr(> chi2) 0.0000    g        6.383    

Residuals

    Min      1Q  Median      3Q     Max 
-4.3807 -1.4314 -0.5405  1.5828  5.4703 

          Coef    S.E.   t     Pr(>|t|)
Intercept 39.9804 8.8745  4.51 0.0001  
am         1.5981 1.9927  0.80 0.4301  
vs         0.8011 1.9201  0.42 0.6801  
cyl       -1.3163 0.7033 -1.87 0.0730  
drat       0.3488 1.6201  0.22 0.8313  
wt        -3.0390 0.9510 -3.20 0.0038  
gear      -1.1450 1.1420 -1.00 0.3256  


> modbw = fastbw(mod)
> modbw

 Deleted Chi-Sq d.f. P      Residual d.f. P      AIC   R2   
 drat    0.05   1    0.8296 0.05     1    0.8296 -1.95 0.838
 vs      0.17   1    0.6800 0.22     2    0.8974 -3.78 0.837
 am      0.58   1    0.4473 0.79     3    0.8509 -5.21 0.833
 gear    0.42   1    0.5194 1.21     4    0.8766 -6.79 0.830

Approximate Estimates after Deleting Factors

            Coef   S.E. Wald Z          P
Intercept 39.686 1.8040 21.999 0.00000000
cyl       -1.508 0.4362 -3.457 0.00054706
wt        -3.191 0.7962 -4.008 0.00006128

Factors in Final Model

[1] cyl wt 

以下是该模型的结构:

> str(modbw)
List of 10
 $ result         : num [1:4, 1:8] 0.0463 0.1701 0.5775 0.4152 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:4] "drat" "vs" "am" "gear"
  .. ..$ : chr [1:8] "Chi-Sq" "d.f." "P" "Residual" ...
 $ names.kept     : chr [1:2] "cyl" "wt"
 $ factors.kept   : int [1:2] 3 5
 $ factors.deleted: int [1:4] 4 2 1 6
 $ parms.kept     : int [1:3] 1 4 6
 $ parms.deleted  : int [1:4] 5 3 2 7
 $ coefficients   : Named num [1:3] 39.69 -1.51 -3.19
  ..- attr(*, "names")= chr [1:3] "Intercept" "cyl" "wt"
 $ var            : num [1:3, 1:3] 3.254 -0.303 -0.358 -0.303 0.19 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:3] "Intercept" "cyl" "wt"
  .. ..$ : chr [1:3] "Intercept" "cyl" "wt"
 $ Coefficients   : num [1:4, 1:7] 41.26 43.17 42.39 39.69 1.68 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:7] "Intercept" "am" "vs" "cyl" ...
 $ force          : NULL
 - attr(*, "class")= chr "fastbw"

以下是摘要的输出结构(summary.lm函数不适用于此模型):

> summary(modbw)
                Length Class  Mode     
result          32     -none- numeric  
names.kept       2     -none- character
factors.kept     2     -none- numeric  
factors.deleted  4     -none- numeric  
parms.kept       3     -none- numeric  
parms.deleted    4     -none- numeric  
coefficients     3     -none- numeric  
var              9     -none- numeric  
Coefficients    28     -none- numeric  
force            0     -none- NULL     
> 
> summary.lm(modbw)
Error in if (p == 0) { : argument is of length zero

但我无法在其中任何一个中找到P值。如何获得fastbw函数最终模型的P值列表?

2 个答案:

答案 0 :(得分:2)

p值的计算发生在print.fastbw函数中,由于某种原因,它们不会从函数返回。我打算使用print.fastbw的源代码自己重新计算它们,但我发现重写我自己的print.fastbw函数来返回p值要快得多。

这是重新设计的功能(注意print2不是通用的):

print2.fastbw <- function (x, digits = 4, estimates = TRUE, ...) 
{
  res <- x$result
  fd <- x$factors.deleted
  if (length(fd)) {
    cres <- cbind(dimnames(res)[[1]], format(round(res[, 
                                                       1], 2)), format(res[, 2]), format(round(res[, 3], 
                                                                                               4)), format(round(res[, 4], 2)), format(res[, 5]), 
                  format(round(res[, 6], 4)), format(round(res[, 7], 
                                                           2)), if (ncol(res) > 7) 
                                                             format(round(res[, 8], 3)))
    dimnames(cres) <- list(rep("", nrow(cres)), c("Deleted", 
                                                  dimnames(res)[[2]]))
    cat("\n")
    if (length(x$force)) 
      cat("Parameters forced into all models:\n", paste(x$force, 
                                                        collapse = ", "), "\n\n")
    print(cres, quote = FALSE)
    if (estimates && length(x$coef)) {
      cat("\nApproximate Estimates after Deleting Factors\n\n")
      cof <- coef(x)
      vv <- if (length(cof) > 1) 
        diag(x$var)
      else x$var
      z <- cof/sqrt(vv)
      stats <- cbind(cof, sqrt(vv), z, 1 - pchisq(z^2, 
                                                  1))
      dimnames(stats) <- list(names(cof), c("Coef", "S.E.", 
                                            "Wald Z", "P"))
      return(stats)
    }
  }
  else cat("\nNo Factors Deleted\n")
  cat("\nFactors in Final Model\n\n")
  nk <- x$names.kept
  if (length(nk)) 
    print(nk, quote = FALSE)
  else cat("None\n")
}

输出:

> results <- print2.fastbw(modbw)

 Deleted Chi-Sq d.f. P      Residual d.f. P      AIC   R2   
 drat    0.05   1    0.8296 0.05     1    0.8296 -1.95 0.838
 vs      0.17   1    0.6800 0.22     2    0.8974 -3.78 0.837
 am      0.58   1    0.4473 0.79     3    0.8509 -5.21 0.833
 gear    0.42   1    0.5194 1.21     4    0.8766 -6.79 0.830

Approximate Estimates after Deleting Factors

> results
               Coef      S.E.    Wald Z            P
Intercept 39.686261 1.8039853 21.999216 0.000000e+00
cyl       -1.507795 0.4362091 -3.456588 5.470608e-04
wt        -3.190972 0.7961871 -4.007817 6.128261e-05

和p值:

> results[,4]
   Intercept          cyl           wt 
0.000000e+00 5.470608e-04 6.128261e-05 

答案 1 :(得分:2)

它们是即时计算的。在rms:::print.fastbw内挖掘(类print的对象的fastbw方法),您可以找到:

   cof <- coef(x)
   vv <- if (length(cof) > 1) diag(x$var) else x$var
   z <- cof/sqrt(vv)
   stats <- cbind(cof, sqrt(vv), z, 1 - pchisq(z^2, 1))

(如果您想要更准确的小p值,最好将pchisq(z^2,1,lower.tail=FALSE)替换为1-pchisq(z^2,1)

如果xfastbw个对象,则stats的最后一列会显示您的p值。