Before running a model, I am using the alias function to check for multicollinearity. My code is:
> ss <- lm(final_res ~ ., data = dev1)
> summary(ss)
Output:
Call:
lm(formula = final_res ~ ., data = dev1)
Residuals:
Min 1Q Median 3Q Max
-0.32213 -0.02461 -0.01624 -0.00899 1.00588
Coefficients: (5 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.226e-02 6.156e-03 10.113 < 2e-16 ***
avg_dp_6month 1.344e-06 3.866e-06 0.348 0.72808
dp_increase 6.861e-03 6.973e-03 0.984 0.32519
dp_decrease -9.699e-03 1.222e-02 -0.794 0.42723
GrossMarginCR60 1.188e-03 3.005e-04 3.953 7.72e-05 ***
ClientCity -4.589e-06 8.414e-06 -0.545 0.58544
Gender -3.789e-03 8.908e-04 -4.254 2.10e-05 ***
ClientSubcategory -5.518e-03 1.399e-03 -3.943 8.04e-05 ***
FinalSegment 1.257e-02 1.485e-03 8.464 < 2e-16 ***
mob_201512 -6.363e-05 9.237e-06 -6.889 5.68e-12 ***
age_201512 -4.555e-04 3.603e-05 -12.641 < 2e-16 ***
wm_channel_flag -2.006e-02 7.122e-03 -2.816 0.00486 **
broking_activity_indicator 3.010e-03 7.209e-04 4.176 2.97e-05 ***
dp_status_flag 1.257e-02 3.130e-03 4.015 5.94e-05 ***
non_mf_tran_avg_6month -1.773e-03 3.170e-04 -5.592 2.25e-08 ***
non_mf_delivery_trade_avg_6month 1.866e-04 8.196e-05 2.276 0.02284 *
non_mf_trading_trade_avg_6month NA NA NA NA
non_mf_buy_tran_avg_6month 4.153e-03 6.389e-04 6.501 8.04e-11 ***
non_mf_sell_tran_avg_6month NA NA NA NA
non_mf_revenue_avg_6month -2.727e-07 8.693e-08 -3.136 0.00171 **
non_mf_quantity_avg_6month -2.143e-09 3.622e-09 -0.592 0.55410
non_mf_volume_avg_6month 2.585e-11 1.530e-11 1.689 0.09123 .
non_mf_normal_terminal_avg_6month 4.665e-03 2.662e-03 1.752 0.07973 .
non_mf_exe_offline_terminal_avg_6month NA NA NA NA
switcher_flag 5.557e-03 2.693e-03 2.064 0.03905 *
only_mf_flag NA NA NA NA
only_non_mf_flag -4.302e-02 3.669e-03 -11.726 < 2e-16 ***
only_mf_and_non_mf_flag NA NA NA NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1348 on 76943 degrees of freedom
(1427 observations deleted due to missingness)
Multiple R-squared: 0.01321, Adjusted R-squared: 0.01293
F-statistic: 46.84 on 22 and 76943 DF, p-value: < 2.2e-16
> alias(ss)
Model :
final_res ~ avg_dp_6month + dp_increase + dp_decrease + GrossMarginCR60 +
ClientCity + Gender + ClientSubcategory + FinalSegment +
mob_201512 + age_201512 + wm_channel_flag + broking_activity_indicator +
dp_status_flag + non_mf_tran_avg_6month + non_mf_delivery_trade_avg_6month +
non_mf_trading_trade_avg_6month + non_mf_buy_tran_avg_6month +
non_mf_sell_tran_avg_6month + non_mf_revenue_avg_6month +
non_mf_quantity_avg_6month + non_mf_volume_avg_6month + non_mf_normal_terminal_avg_6month +
non_mf_exe_offline_terminal_avg_6month + switcher_flag +
only_mf_flag + only_non_mf_flag + only_mf_and_non_mf_flag
Complete :
(Intercept) avg_dp_6month dp_increase dp_decrease GrossMarginCR60 ClientCity Gender ClientSubcategory FinalSegment mob_201512
non_mf_trading_trade_avg_6month 0 0 0 0 0 0 0 0 0 0
non_mf_sell_tran_avg_6month 0 0 0 0 0 0 0 0 0 0
non_mf_exe_offline_terminal_avg_6month 0 0 0 0 0 0 0 0 0 0
only_mf_flag 0 0 0 0 0 0 0 0 0 0
only_mf_and_non_mf_flag 1 0 0 0 0 0 0 0 0 0
age_201512 wm_channel_flag broking_activity_indicator dp_status_flag non_mf_tran_avg_6month non_mf_delivery_trade_avg_6month
non_mf_trading_trade_avg_6month 0 0 0 0 1 -1
non_mf_sell_tran_avg_6month 0 0 0 0 1 0
non_mf_exe_offline_terminal_avg_6month 0 0 0 0 1 0
only_mf_flag 0 0 0 0 0 0
only_mf_and_non_mf_flag 0 0 0 0 0 0
non_mf_buy_tran_avg_6month non_mf_revenue_avg_6month non_mf_quantity_avg_6month non_mf_volume_avg_6month
non_mf_trading_trade_avg_6month 0 0 0 0
non_mf_sell_tran_avg_6month -1 0 0 0
non_mf_exe_offline_terminal_avg_6month 0 0 0 0
only_mf_flag 0 0 0 0
only_mf_and_non_mf_flag 0 0 0 0
non_mf_normal_terminal_avg_6month switcher_flag only_non_mf_flag
non_mf_trading_trade_avg_6month 0 0 0
non_mf_sell_tran_avg_6month 0 0 0
non_mf_exe_offline_terminal_avg_6month -1 0 0
only_mf_flag 0 0 0
only_mf_and_non_mf_flag 0 0 -1
How can I save the alias output (the Complete section) in a data frame? I tried tidy() from the broom library, which I have used before to store a model summary as a data frame, but it does not work here. Any suggestions on how to do this?
Answer 0 (score: 1)
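The list returned by alias() has only two or three elements, one of which is Complete. Slightly modifying the example in ?alias to check that it works for lm objects as well as aov ones (although the help page explicitly states that it should):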
> op <- options(contrasts = c("contr.helmert", "contr.poly"))
> lm.mod <- lm(yield ~ block + N*P*K, npk)
> alias(lm.mod)$Complete
(Intercept) block1 block2 block3 block4 block5 N1 P1 K1 N1:P1
N1:P1:K1 0 1 1/3 1/6 -3/10 -1/5 0 0 0 0
N1:K1 P1:K1
N1:P1:K1 0 0
> as.data.frame(alias(lm.mod)$Complete)
(Intercept) block1 block2 block3 block4 block5 N1 P1 K1 N1:P1
N1:P1:K1 0 1 0.3333333 0.1666667 -0.3 -0.2 0 0 0 0
N1:K1 P1:K1
N1:P1:K1 0 0
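Building on the transcript above, here is a minimal sketch of two ways the Complete matrix might be stored, again using the built-in npk data. The names m, alias_df and alias_long are made up for illustration. Rebuilding the matrix with as.numeric() is a defensive assumption: the fraction-style printing suggests the matrix carries a display class, and the step is harmless if it does not.

```r
# Reproduce the ?alias setup from the transcript above
op <- options(contrasts = c("contr.helmert", "contr.poly"))
lm.mod <- lm(yield ~ block + N*P*K, data = npk)

# Complete has one row per aliased (non-estimable) term
raw <- alias(lm.mod)$Complete

# Defensive assumption: rebuild as a plain numeric matrix so that any
# display class or extra attributes are dropped
m <- matrix(as.numeric(raw), nrow = nrow(raw), dimnames = dimnames(raw))

# Wide form: one row per aliased term, with the row names moved
# into an ordinary column (aliased_term is a hypothetical name)
alias_df <- cbind(aliased_term = rownames(m), as.data.frame(m))
rownames(alias_df) <- NULL

# Long form: one row per (aliased term, term) pair, keeping only the
# non-zero coefficients, which is easier to scan for wide models
names(dimnames(m)) <- c("aliased_term", "term")
alias_long <- as.data.frame(as.table(m), responseName = "coefficient")
alias_long <- alias_long[abs(alias_long$coefficient) > 1e-8, ]

# Restore the original contrasts setting
options(op)
```

For the model in the question, the same steps would presumably apply with ss in place of lm.mod. The long form may be the more convenient one there, since most entries of Complete are zero.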