Update: when I try to use "dput" to show some of my data, all I can see is a long string of 0s and 1s, followed by:
label = "CURRENT JOB STATUS- 1", class = c("labelled",
"integer"))), .Names = c("is.controlrp", "is.lowrp", "is.somerp",
"is.highrp", "is.substantialrp", "is.ignorerp", "Risk_Pct", "PCT_Stocks_MF_1",
"is.native", "is.male", "rp", "rp2", "is.lowrp.1", "is.controlrp.1",
"is.somerp.1", "is.highrp.1", "is.substantial2", "is.ignorerp.1",
"psm1", "psm2", "noStocks", "is.control2", "is.low2", "is.some2",
"is.high2", "is.substantial2.1", "is.ignore2", "age2", "agenew",
"tw", "is.controlW", "is.debt", "is.lowW", "is.medW", "is.highW",
"sme", "is.controlSME", "is.lowSME", "oh", "ms", "cjs"), row.names = c(NA,
-16000L), class = "data.frame")
In case it helps, here is the "head":
> head(dummydata2)
is.controlrp is.lowrp is.somerp is.highrp is.substantialrp is.ignorerp Risk_Pct PCT_Stocks_MF_1
1 1 1 1 0 1 1 7 50
2 1 1 1 1 0 1 10 NA
3 1 1 1 0 1 1 8 40
4 1 1 1 0 1 1 8 40
5 1 1 1 0 1 1 8 40
6 1 1 0 1 1 1 5 998
is.native is.male rp rp2 is.lowrp.1 is.controlrp.1 is.somerp.1 is.highrp.1
1 0 1 high 3 1 1 1 0
2 0 0 substantial 4 1 1 1 1
3 0 1 high 3 1 1 1 0
4 0 1 high 3 1 1 1 0
5 0 1 high 3 1 1 1 0
6 0 1 some 2 1 1 0 1
is.substantial2 is.ignorerp.1 psm1 psm2 noStocks is.control2 is.low2 is.some2 is.high2
1 1 1 3 high 1 1 1 1 0
2 <NA> 1 NA <NA> NA <NA> <NA> <NA> <NA>
3 1 1 2 some 1 1 1 0 1
4 1 1 2 some 1 1 1 0 1
5 1 1 2 some 1 1 1 0 1
6 1 1 5 ignore 1 1 1 1 1
is.substantial2.1 is.ignore2 age2 agenew tw is.controlW is.debt is.lowW is.medW is.highW sme
1 1 1 2 1 4 1 1 1 1 0 NA
2 <NA> <NA> 3 1 1 1 0 1 1 1 NA
3 1 1 3 1 NA NA NA NA NA NA NA
4 1 1 3 1 NA NA NA NA NA NA NA
5 1 1 3 1 NA NA NA NA NA NA NA
6 1 0 3 1 4 1 1 1 1 0 4
is.controlSME is.lowSME oh ms cjs
1 NA <NA> 0 1 1
2 NA <NA> 1 1 1
3 NA <NA> 0 0 0
4 NA <NA> 0 0 0
5 NA <NA> 0 0 0
6 1 1 0 1 0
Hello, I would like help plotting the following regression in R:
fit1_usesrp <- vglm(rp ~ is.native + is.male + oh + cjs + age2 + tw, family = propodds, data = dummydata2)
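With this regression, I am interested in whether immigrant status is a determinant of risk preference. "rp" in the regression is a categorical variable that measures the respondent's willingness to take risk (no risk tolerance, low risk tolerance, and so on). "is.native" is a dummy variable for immigrant status (0 = native, 1 = immigrant). "oh", "cjs", and "age2" are dummy variables for factors that may influence risk preference. "tw" stands for total wealth; I use the actual raw value of wealth rather than trying to turn it into a dummy variable.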
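For reference, my understanding of the model that "propodds" fits is sketched below. This is based on the linear predictor names reported in the summary (logit(P[Y>=2]) through logit(P[Y>=6])), which suggest that "rp" has six ordered levels. The symbols \alpha_j, \beta, and x are notation I am introducing here, not something printed by VGAM.

```latex
\operatorname{logit} P(Y \ge j \mid x)
  = \log \frac{P(Y \ge j \mid x)}{P(Y < j \mid x)}
  = \alpha_j + \beta^{\top} x,
  \qquad j = 2, \dots, 6
```

Here \alpha_2, ..., \alpha_6 would correspond to (Intercept):1 through (Intercept):5, and the single slope vector \beta is shared across all five cutpoints, which is the proportional odds assumption.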
The output of this regression is:
Call:
vglm(formula = rp ~ is.native + is.male + oh + cjs + age2 + tw,
family = propodds, data = dummydata2)
Coefficients:
(Intercept):1 (Intercept):2 (Intercept):3 (Intercept):4 (Intercept):5 is.native is.male
2.796024e+00 -4.948484e-01 -5.055242e-01 -6.460196e-01 -2.545089e+00 -7.381110e-02 1.509080e-01
oh cjs age2 tw
-2.070633e-02 3.643551e-03 7.149891e-02 -3.092727e-06
Degrees of Freedom: 52000 Total; 51989 Residual
Residual deviance: 24705.39
Log-likelihood: -12352.69
> summary(fit1_usesrp)
Call:
vglm(formula = rp ~ is.native + is.male + oh + cjs + age2 + tw,
family = propodds, data = dummydata2)
Pearson residuals:
Min 1Q Median 3Q Max
logit(P[Y>=2]) -5.0292 0.1305 0.2830 0.2989 0.5201
logit(P[Y>=3]) -0.6676 -0.5986 -0.5517 0.5911 14.8594
logit(P[Y>=4]) -13.1696 -0.5270 -0.4856 0.6051 3.0437
logit(P[Y>=5]) -4.0121 -0.4082 -0.3802 1.0345 1.2412
logit(P[Y>=6]) -0.6133 -0.5764 -0.1601 -0.1548 3.5492
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 2.796e+00 6.689e-02 41.803 < 2e-16 ***
(Intercept):2 -4.948e-01 5.352e-02 -9.246 < 2e-16 ***
(Intercept):3 -5.055e-01 5.353e-02 -9.444 < 2e-16 ***
(Intercept):4 -6.460e-01 5.368e-02 -12.035 < 2e-16 ***
(Intercept):5 -2.545e+00 6.113e-02 -41.635 < 2e-16 ***
is.native -7.381e-02 6.224e-02 -1.186 0.2357
is.male 1.509e-01 3.764e-02 4.010 6.08e-05 ***
oh -2.071e-02 4.747e-02 -0.436 0.6627
cjs 3.644e-03 4.365e-02 0.083 0.9335
age2 7.150e-02 2.449e-02 2.920 0.0035 **
tw -3.093e-06 1.479e-06 -2.092 0.0365 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Number of linear predictors: 5
Names of linear predictors:
logit(P[Y>=2]), logit(P[Y>=3]), logit(P[Y>=4]), logit(P[Y>=5]), logit(P[Y>=6])
Dispersion Parameter for cumulative family: 1
Residual deviance: 24705.39 on 51989 degrees of freedom
Log-likelihood: -12352.69 on 51989 degrees of freedom
Number of iterations: 4
Exponentiated coefficients:
is.native is.male oh cjs age2 tw
0.9288471 1.1628897 0.9795066 1.0036502 1.0741170 0.9999969
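As far as I can tell, the exponentiated coefficients above are simply exp() applied to the slope estimates. A minimal base-R sketch that reproduces them, with the estimates copied by hand from the coefficient printout (the names `slopes` and `odds_ratios` are my own, not part of VGAM):

```r
# Slope estimates copied by hand from the vglm() coefficient printout
slopes <- c(is.native = -7.381110e-02,
            is.male   =  1.509080e-01,
            oh        = -2.070633e-02,
            cjs       =  3.643551e-03,
            age2      =  7.149891e-02,
            tw        = -3.092727e-06)

# Under proportional odds, exp(slope) is the odds ratio for
# P(Y >= j) versus P(Y < j), and it is the same at every cutpoint j
odds_ratios <- exp(slopes)
print(round(odds_ratios, 7))
```

Read this way, and assuming is.male = 1 codes a male respondent, the value of about 1.163 for "is.male" would mean the odds of falling in a higher level of "rp" are roughly 16% higher for males, holding the other variables fixed.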