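I want to move from Stata to R, but I am struggling to integrate variable labels and lookup values into my data. In Stata, for each text data file (.dat) I have a dictionary file (.dct) that defines the variable types in the .dat, plus a script file (.do) that attaches the variable labels and lookup values before saving the final binary data file. For example: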
infix using "otherincome.dct"
label variable HH_ID "Description of HH_ID"
label variable record_id "Description of record_id"
label variable income "Description of income"
label variable inctimes "Description of inctimes"
label variable incperiod "Description of incperiod"
label variable incsea "Description of incsea"
label variable incamnt "Description of incamnt"
label variable incgender "Description of incgender"
label variable incnotes "Description of incnotes"
label variable incfactor "Description of incfactor"
#delimit ;
label define LBL40
-999 "Not clear in paper"
-888 "Empty"
-777 "Other"
1 "Male"
2 "Female"
4 "Jointly"
;
label define LBL41
-999 "Not clear in paper"
-888 "Empty"
-777 "Other"
1 "Remittances"
2 "Business"
3 "Employment"
4 "Savings"
;
label define LBL42
-999 "Not clear in paper"
-888 "Empty"
-777 "Other"
1 "Hour"
2 "Day"
3 "Month"
4 "Year"
;
label define LBL43
-999 "Not clear in paper"
-888 "Empty"
-777 "Other"
1 "Long rains"
2 "Short rains"
3 "Dry spell"
4 "All seasons"
;
#delimit cr
label values incgender LBL40
label values income LBL41
label values incperiod LBL42
label values incsea LBL43
/* Save data as binary data */
save otherincome
Is it possible to do something like this in R?
Answer 0 (score: 0)
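Base R has no built-in way to assign a label to an ordinary variable. You could set a custom attribute (or something similar) to keep track of the descriptions and write custom print methods for your objects, but that may be overkill.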
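As a minimal sketch of the custom-attribute idea, base R's `attr()` can attach a description to each column. The attribute name "label" is just a convention chosen here (base R does not interpret it), and the data values below are made up for illustration.

```r
# Hypothetical data standing in for the contents of otherincome.dat
otherincome <- data.frame(
  HH_ID   = c(101, 102, 103),
  incamnt = c(2500, 400, 1200)
)

# Variable descriptions, mirroring the `label variable` lines in the .do file
var_labels <- c(
  HH_ID   = "Description of HH_ID",
  incamnt = "Description of incamnt"
)

# Attach each description as a "label" attribute on the matching column
for (v in names(var_labels)) {
  attr(otherincome[[v]], "label") <- var_labels[[v]]
}

# Read a label back
attr(otherincome$incamnt, "label")
```

Keep in mind that a custom attribute like this is only metadata: many operations, such as subsetting a vector with `[`, drop it, so it may need to be reapplied after transformations.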
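Some of your labels look like they belong to categorical variables, which R calls factors. You can set one up like this:
incgender <- factor(c(4, 1, 1, 2, -999, 1, -777),
                    levels = c(-999, -888, -777, 1, 2, 4),
                    labels = c("Not clear in paper", "Empty", "Other", "Male", "Female", "Jointly"))
print(incgender)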
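To avoid writing one `factor()` call per variable, the lookup sets can be kept in a list and applied in a loop, which roughly mirrors the `label define` / `label values` split in the .do file. This is only a sketch: the names `value_labels` and `label_map` are hypothetical, the data are made up, and only two of the four label sets are shown.

```r
# Value-label sets copied from the `label define` blocks in the question
value_labels <- list(
  LBL40 = c("Not clear in paper" = -999, "Empty" = -888, "Other" = -777,
            "Male" = 1, "Female" = 2, "Jointly" = 4),
  LBL42 = c("Not clear in paper" = -999, "Empty" = -888, "Other" = -777,
            "Hour" = 1, "Day" = 2, "Month" = 3, "Year" = 4)
)

# Which label set applies to which variable (the `label values` lines)
label_map <- c(incgender = "LBL40", incperiod = "LBL42")

# Made-up raw codes standing in for the imported data
otherincome <- data.frame(
  incgender = c(4, 1, 1, 2, -999, 1, -777),
  incperiod = c(3, 3, 2, -888, 4, 1, 3)
)

# Convert each coded column to a factor using its label set
for (v in names(label_map)) {
  lbl <- value_labels[[label_map[[v]]]]
  otherincome[[v]] <- factor(otherincome[[v]],
                             levels = unname(lbl),
                             labels = names(lbl))
}

# Frequency table of the labelled values
table(otherincome$incgender)
```

Codes that do not appear in a label set become `NA` in the resulting factor, so it is worth checking for unexpected `NA`s after the conversion.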