How do I extract the top variables from a regression tree and use them in a regression?

Asked: 2013-08-15 20:27:36

Tags: r statistics regression

I created a regression tree with rpart and stored the result in reg_tree:

# show summary statistics of reg_tree
summary(reg_tree)

# store top variables as new values
topvars <- reg_tree$variable.importance

# output of topvars
topvars 

q_21fb1900   q_2b3296a0          q_0   q_fde6a01e   q_7fa850ed   q_323d6cee   q_c6ab3657   q_eb2ad90d   q_5dcb2b57 
5.303283e+15 5.196871e+15 4.002239e+15 4.412505e+14 2.616730e+14 2.162128e+14 2.035465e+14 1.354927e+14 5.095959e+13 
  q_af2830be   q_caa61b2c   q_a6828865   q_99f5a0bd   q_be83fe28   q_efdc29dd   q_9e86aa7f   q_2ea0e2aa   q_5049294d 
2.176437e+13 1.210118e+13 1.126591e+13 8.387189e+12 4.951978e+12 4.115929e+12 3.864235e+12 1.449853e+12 5.436949e+11 
  q_5ae0f0cd   q_518fba14 
5.436949e+11 5.412242e+11

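I want to extract each of these names as xvar1, xvar2, and so on, and automatically put them into the following model, where each xvar corresponds to a column header: lm(y_var ~ xvar1 + xvar2 + xvar3 + ..., data)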

lm(y_var ~ q_21fb1900 + q_2b3296a0 + q_0 + ..., data)

How can I do this so that I can plug in a new data set later without having to edit each xvar by hand?

1 Answer:

Answer 0 (score: 1)

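Try this: take the names of the variable.importance vector, paste them together with " + ", and convert the resulting string to a formula with as.formula.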

Example:

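library(rpart)  # provides rpart() and the kyphosis data set

reg_tree <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
topvars <- reg_tree$variable.importance
# Build the formula string from the importance names, then fit the linear model
myreg<-lm(as.formula(paste("as.numeric(Kyphosis) ~ ",paste(names(topvars), collapse = " + "), sep = "")),data=kyphosis)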
> summary(myreg)

Call:
lm(formula = as.formula(paste("as.numeric(Kyphosis) ~ ", paste(names(topvars), 
    collapse = " + "), sep = "")), data = kyphosis)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.79440 -0.22356 -0.08478  0.10205  0.84768 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.2612198  0.1934124   6.521 6.61e-09 ***
Start       -0.0307392  0.0091166  -3.372  0.00117 ** 
Age          0.0010657  0.0006937   1.536  0.12858    
Number       0.0525555  0.0274522   1.914  0.05928 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.3599 on 77 degrees of freedom
Multiple R-squared: 0.2575, Adjusted R-squared: 0.2285 
F-statistic:   8.9 on 3 and 77 DF,  p-value: 3.912e-05
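
To reuse this on a new data set without editing variable names, the same idea can be wrapped in a small function. The following is only a sketch under a few assumptions: `lm_from_tree` and its arguments are hypothetical names (not part of rpart), the response is assumed to be a plain numeric column, and `reformulate()` from the stats package is swapped in for the `paste()`/`as.formula()` combination above. The built-in `mtcars` data is used purely for illustration.

```r
library(rpart)

# Hypothetical helper (not part of rpart): fit lm() on the variables
# that an rpart tree ranks as most important.
lm_from_tree <- function(tree, response, data, n = NULL) {
  imp <- tree$variable.importance
  if (is.null(imp)) stop("the tree has no splits, so no variable importance")
  # Sort explicitly rather than relying on the stored order
  vars <- names(sort(imp, decreasing = TRUE))
  # Optionally keep only the n most important variables
  if (!is.null(n)) vars <- head(vars, n)
  # reformulate() builds response ~ var1 + var2 + ... from character vectors
  lm(reformulate(vars, response = response), data = data)
}

# Illustration with a built-in data set (an assumption, not the asker's data)
tree <- rpart(mpg ~ ., data = mtcars)
fit <- lm_from_tree(tree, "mpg", mtcars, n = 2)
summary(fit)
```

Calling the helper with a different tree, response name, and data frame should be enough to refit the model; the `n` argument is an optional cutoff for cases like the question's, where the importance vector has a long tail of weak variables.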