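As the two regressions below show, lm() returns a solution where rlm() fails, because the design matrix is singular. Internally, lm() drops one factor level (its coefficient is reported as NA) to get around the singularity, but rlm() does not.
Simple linear regression case: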
result.lm <- lm(log(export + import) ~ log(gdp.i*gdp.j) +
log(dis) + log(Sij) + AFC + GFC + I(dpgdp*0.001)+
factor(id),
data = mydata)
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -130.56396 34.10023 -3.829 0.000177 ***
log(gdp.i * gdp.j) 1.73176 0.20873 8.297 2.39e-14 ***
log(dis) 6.89208 4.75270 1.450 0.148750
log(Sij) -5.18435 1.80221 -2.877 0.004502 **
AFC -1.00819 0.86188 -1.170 0.243640
GFC 0.49326 0.58950 0.837 0.403834
I(dpgdp * 0.001) -0.05713 0.03733 -1.530 0.127701
factor(id)IDN_PHL -7.02467 4.46062 -1.575 0.117044
factor(id)IDN_SGP 4.10315 1.42839 2.873 0.004558 **
factor(id)IDN_THA -3.37530 3.44619 -0.979 0.328675
factor(id)IDN_VNM -11.75983 5.24573 -2.242 0.026189 *
factor(id)MYS_SGP 12.16543 6.13940 1.982 0.049045 *
factor(id)MYS_THA 2.75659 0.72603 3.797 0.000200 ***
factor(id)MYS_VNM -5.31554 3.01239 -1.765 0.079325 .
factor(id)PHL_MYS -3.74970 3.82106 -0.981 0.327742
factor(id)PHL_SGP -3.72441 3.84997 -0.967 0.334642
factor(id)PHL_THA -2.32179 3.26691 -0.711 0.478187
factor(id)PHL_VNM -2.43611 2.32941 -1.046 0.297045
factor(id)SGP_THA 4.18854 1.68147 2.491 0.013639 *
factor(id)SGP_VNM -0.54607 3.62445 -0.151 0.880409
factor(id)THA_VNM NA NA NA NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.982 on 181 degrees of freedom
Multiple R-squared: 0.5808, Adjusted R-squared: 0.5368
F-statistic: 13.2 on 19 and 181 DF, p-value: < 2.2e-16
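To see which terms lm() set aside, one can look for NA coefficients and compare the rank of the design matrix with its number of columns. The following is a minimal sketch on made-up toy data (the data frame `toy` and its columns `g`, `d`, `x`, `y` are hypothetical, not taken from mydata); `d` is constant within each group, much as dis takes a single value per id in the data excerpt further down.

```r
# Toy data (hypothetical): 'd' is constant within each group 'g',
# so d is a linear combination of the intercept and the g dummies.
toy <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 4)),
  d = rep(c(1, 2, 4), each = 4),
  x = c(0.5, 1.2, 2.1, 3.3, 0.9, 1.8, 2.7, 3.1, 0.2, 1.1, 2.4, 3.9),
  y = c(1.1, 2.3, 2.9, 4.2, 2.0, 3.1, 3.8, 4.4, 3.2, 4.0, 5.3, 6.9)
)

fit <- lm(y ~ x + d + factor(g), data = toy)

# Coefficients that lm() could not estimate are reported as NA
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# Rank of the design matrix versus its number of columns
X <- model.matrix(fit)
rank_X <- qr(X)$rank
cat("columns:", ncol(X), " rank:", rank_X, "\n")
```

When the rank is smaller than the number of columns, at least one column is redundant, which is the situation rlm() refuses to handle.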
Robust regression case:
To avoid the singularity in rlm() from the MASS package, I used:
library(car); library(MASS)
result.rlm <- rlm(log(export + import) ~ log(gdp.i*gdp.j) +
log(dis) + log(Sij) + AFC + GFC + I(dpgdp*0.001)+
factor(id, exclude = "THA_VNM"),
data = mydata)
This returns the error message:
Error in rlm.default(x, y, weights, method = method, wt.method = wt.method, :
'x' is singular: singular fits are not implemented in 'rlm'
In addition: Warning message:
In as.vector(exclude, typeof(x)) : NAs introduced by coercion
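For context, the exclude argument of factor() does not remove a dummy column from the design matrix. It turns the matching values into NA, and the default na.action then drops those observations from the model frame. A small sketch on a made-up character vector (the names `id`, `y`, `f` and `mf` here are illustrative only):

```r
# Hypothetical toy input: a character id with three distinct values
id <- c("a", "b", "c", "a", "b", "c")
y  <- c(1, 2, 3, 4, 5, 6)

# Excluding "c" removes it from the levels and maps its entries to NA
f <- factor(id, exclude = "c")
print(levels(f))
print(sum(is.na(f)))

# With the default na.action (na.omit), rows with NA are dropped
mf <- model.frame(y ~ f, data = data.frame(y = y, f = f))
print(nrow(mf))
```

So even when the call succeeds, this route discards the THA_VNM rows instead of merely leaving out one dummy variable.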
How can I exclude one level of the factor id so that rlm() returns a result?
A portion of mydata that can be used to reproduce the problem is shown below:
id export import gdp.i gdp.j dis Sij AFC GFC dpgdp
PHL_MYS 21090 54082 1.03E+11 1.44E+11 2470.863 0.243267763 0 0 4352.999196
IDN_MYS 1273344 6350191 2.86E+11 1.44E+11 1174.196 0.222531092 0 0 4280.470783
IDN_PHL 1352286 6501568 2.86E+11 1.03E+11 2792.088 0.194772855 0 0 72.528413
MYS_SGP 11849639 3100352 1.44E+11 1.24E+11 315.5433 0.248594031 0 0 23398.89647
PHL_SGP 1010140 3406594 1.03E+11 1.24E+11 2396.775 0.247965171 0 0 27751.89567
IDN_SGP 62247342 2374634 2.86E+11 1.24E+11 886.1407 0.210675425 0 0 27679.36726
PHL_THA 126863901 131288917 1.03E+11 1.76E+11 2210.015 0.232802184 0 0 1489.015435
IDN_THA 174813908 406988998 2.86E+11 1.76E+11 2316.466 0.23596528 0 0 1416.487022
SGP_THA 131102650 238482275 1.24E+11 1.76E+11 1433.936 0.242235497 0 0 26262.88023
MYS_THA 45626339 92914926 1.44E+11 1.76E+11 1187.123 0.247368503 0 0 2863.983761
IDN_VNM 14635829 1705791 2.86E+11 57633255739 3023.314 0.139630737 0 0 573.9790196
MYS_VNM 1140384 19607 1.44E+11 57633255739 2040.94 0.204415889 0 0 4854.449802
SGP_VNM 5413912 5137507 1.24E+11 57633255739 2207.195 0.216937571 0 0 28253.34628
THA_VNM 10375 316 1.76E+11 57633255739 990.7018 0.185642137 0 0 1990.466041
IDN_MYS 61500692 3164431 3.65E+11 1.63E+11 1174.196 0.213350529 0 0 4578.607187
IDN_SGP 12985625 23866106 3.65E+11 1.39E+11 886.1407 0.199850336 0 0 29984.59857
PHL_SGP 18400 116669 1.22E+11 1.39E+11 2396.775 0.248964804 0 0 30186.80165
MYS_SGP 14410298 2247747 1.63E+11 1.39E+11 315.5433 0.248461189 0 0 25405.99138
MYS_THA 68755379 26223833 1.63E+11 2.07E+11 1187.123 0.246396222 0 0 3036.402002
IDN_THA 49410654 138502983 3.65E+11 2.07E+11 2316.466 0.231027427 0 0 1542.205185
PHL_THA 72197200 166850064 1.22E+11 2.07E+11 2210.015 0.233390872 0 0 1744.408267
SGP_THA 132220146 277333084 1.39E+11 2.07E+11 1433.936 0.240330641 0 0 28442.39339
PHL_VNM 40525 3475176 1.22E+11 66371664817 1750.016 0.228081191 0 0 602.1758296
SGP_VNM 9686544 6916182 1.39E+11 66371664817 2207.195 0.218722399 0 0 30788.97748
MYS_VNM 118597 107725 1.63E+11 66371664817 2040.94 0.205795797 0 0 5382.986099
THA_VNM 2925753 11249569 2.07E+11 66371664817 990.7018 0.183801906 0 0 2346.584097
IDN_MYS 88079132 24559821 4.32E+11 1.94E+11 1174.196 0.213634936 0 0 5347.115267
IDN_PHL 25877152 6138473 4.32E+11 1.49E+11 2792.088 0.190862982 0 0 190.7373159
MYS_SGP 25406889 2592050 1.94E+11 1.69E+11 315.5433 0.248823886 0 0 29547.92915
IDN_SGP 104020998 4359943 4.32E+11 1.69E+11 886.1407 0.201927152 0 0 34895.04442
PHL_THA 51290535 259950903 1.49E+11 2.47E+11 2210.015 0.234834327 0 0 2057.166994
IDN_THA 15456842 82233669 4.32E+11 2.47E+11 2316.466 0.2314039 0 0 1866.429678
MYS_THA 25580025 405724623 1.94E+11 2.47E+11 1187.123 0.246323269 0 0 3480.685589
SGP_THA 109397804 181203225 1.69E+11 2.47E+11 1433.936 0.241136255 0 0 33028.61474
IDN_VNM 116169 411089 4.32E+11 77414425532 3023.314 0.128828319 0 0 952.108653
MYS_VNM 78770 5099 1.94E+11 77414425532 2040.94 0.204073979 0 0 6299.22392
PHL_VNM 12322 442466 1.49E+11 77414425532 1750.016 0.224837139 0 0 761.3713371
SGP_VNM 12167407 6959737 1.69E+11 77414425532 2207.195 0.215604147 0 0 35847.15307
THA_VNM 192568 15221723 2.47E+11 77414425532 990.7018 0.181693618 0 0 2818.538331
IDN_MYS 195446734 38077097 5.10E+11 2.31E+11 1174.196 0.214515503 0 0 6282.103658
IDN_PHL 2221 1074 5.10E+11 1.74E+11 2792.088 0.18941607 0 0 257.2702704
IDN_SGP 137780587 131012335 5.10E+11 1.79E+11 886.1407 0.192218812 0 0 34794.08436
MYS_SGP 29608269 1785983 2.31E+11 1.79E+11 315.5433 0.245966948 0 0 28511.9807
IDN_THA 83960790 638313022 5.10E+11 2.73E+11 2316.466 0.226956384 0 0 1940.13688
PHL_THA 93904304 489639916 1.74E+11 2.73E+11 2210.015 0.237698194 0 0 2197.40715
MYS_THA 27635575 463572856 2.31E+11 2.73E+11 1187.123 0.248294683 0 0 4341.966779
SGP_THA 91150086 272714486 1.79E+11 2.73E+11 1433.936 0.239243442 0 0 32853.94748
SGP_VNM 32692201 8777399 1.79E+11 99130304099 2207.195 0.229411821 0 0 35807.78867
PHL_VNM 1183981 452291 1.74E+11 99130304099 1750.016 0.231359489 0 0 756.4340478
MYS_VNM 339799 1114755 2.31E+11 99130304099 2040.94 0.210114801 0 0 7295.807976
THA_VNM 278151 32005 2.73E+11 99130304099 990.7018 0.195565711 0 0 2953.841198
IDN_VNM 40753 568034 5.10E+11 99130304099 3023.314 0.13621204 0 0 1013.704318
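Answer 0 (score: 0)
Let's solve the problem in a slightly roundabout way: first build a dummy matrix from id, then run rlm() with the columns for some of the levels left out.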
library(MASS)  # provides rlm()
# create dummy matrix for id: one 0/1 column per distinct id
idx <- sort(unique(mydata$id))
dummy <- matrix(NA, nrow = nrow(mydata), ncol = length(idx))
for (j in seq_along(idx)) {
dummy[,j] <- as.integer(mydata$id == idx[j])
}
dummy <- data.frame(dummy)
names(dummy) <- idx
mydata <- cbind(mydata, dummy)
# run rlm excluding some levels (e.g. levels 14 and 15) of id:
# only the first 13 distinct ids enter the formula as dummy regressors
model <- as.formula(paste("log(export + import) ~ log(gdp.i*gdp.j) +
log(dis)+ log(Sij) + AFC + GFC + I(dpgdp*0.001)",
paste(unique(mydata$id)[1:13], collapse = " + "),sep="+"))
result.rlm <- rlm(model, data = mydata)
summary(result.rlm)
Call: rlm(formula = model, data = mydata)
Residuals:
Min 1Q Median 3Q Max
-9.17356 -1.21478 0.07208 1.25235 5.18840
Coefficients:
Value Std. Error t value
(Intercept) -333.2742 48.8993 -6.8155
log(gdp.i * gdp.j) 1.2882 0.0745 17.2959
log(dis) 37.6566 6.1469 6.1262
log(Sij) 1.4847 0.6695 2.2174
AFC 0.4229 0.2791 1.5152
GFC -0.0674 0.2331 -0.2892
I(dpgdp * 0.001) -0.0819 0.0137 -5.9591
IDN_MYS 20.1923 3.4593 5.8371
IDN_PHL -13.8168 1.9475 -7.0948
IDN_SGP 34.1516 5.4185 6.3028
MYS_SGP 72.8390 11.7699 6.1886
IDN_THA -5.9810 0.8392 -7.1271
MYS_THA 20.6914 3.3979 6.0894
SGP_THA 15.8244 2.4896 6.3563
IDN_VNM -16.0143 2.4548 -6.5236
THA_VNM 28.0544 4.5140 6.2150
PHL_MYS -7.0767 1.1393 -6.2115
PHL_SGP -3.3307 0.7094 -4.6954
PHL_THA -2.5637 0.5553 -4.6163
PHL_VNM 4.8306 1.0629 4.5445
Residual standard error: 1.827 on 1455 degrees of freedom
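A possible alternative, offered only as a sketch and not as the approach used above: let lm() identify the aliased coefficients, drop the corresponding columns from the design matrix, and pass the reduced matrix to the matrix interface of rlm(). The toy data below is made up (the names `toy`, `g`, `d`, `x`, `y` are hypothetical); `d` is constant within each level of `g`, in the same way that dis takes one value per id in the data excerpt from the question.

```r
library(MASS)  # provides rlm()

# Toy data (hypothetical): d is constant within each level of g,
# which makes the design matrix rank deficient.
set.seed(1)
toy <- data.frame(
  g = factor(rep(c("a", "b", "c", "e"), each = 10)),
  d = rep(c(1, 2, 4, 7), each = 10),
  x = rnorm(40)
)
toy$y <- 1 + 2 * toy$x + 0.5 * toy$d + rnorm(40, sd = 0.3)

form <- y ~ x + d + g

# Step 1: let lm() find the aliased coefficients (reported as NA)
fit.lm <- lm(form, data = toy)
keep <- !is.na(coef(fit.lm))

# Step 2: build the design matrix and keep only the estimable columns
X <- model.matrix(form, data = toy)[, keep, drop = FALSE]

# Step 3: fit rlm() on the reduced, full-rank matrix.
# X already contains the intercept column, so it is passed as-is.
fit.rlm <- rlm(X, toy$y)
print(coef(fit.rlm))
```

Which column ends up dropped depends on the order of the terms in the formula, so the remaining dummy coefficients should be read relative to that choice.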