I am trying to build a complete dataset by merging several smaller datasets:
cd "\\files"
use "\\files\Creatinine.dta"
*merging with report data for baseline demographics *
merge m:1 id using "Archive\Report.dta"
* keeping only those transplanted 2002-2015 *
drop if tx1 <= date("01/01/2002", "DMY") | tx1 >= date("31/12/2015", "DMY")
drop _merge
* labelling variables *
label define org 1 "Heart" 2 "Lung" 3 "Liver" 5 "Multiple" 6 "Small Bowel" 7 "Pancreas" 8 "Stomach"
label values organ1 organ2 organ3 org
label values multi1 multi2 multi3 multi4 org
label variable organ1 "First Organ"
label variable organ2 "Second Organ"
label variable organ3 "Third Organ"
label variable donor_type1 "First Donor Type"
label variable tx1 "Date of First Transplant"
label variable tx2 "Date of Second Transplant"
label variable tx3 "Date of Third Transplant"
label variable dob "Date of Birth"
label variable tx1_loc "First Transplant Location"
label variable multi1 "Multiple Organ 1"
label variable multi2 "Multiple Organ 2"
label variable multi3 "Multiple Organ 3"
label variable multi4 "Multiple Organ 4"
label variable censor_date "Censor Date"
label define loc 1 "Hospital"
label values tx1_loc loc
label define sex1 1 "Male" 2 "Female"
label values sex sex1
label variable sex "Sex of Child"
label define donor 1 "Living" 2 "Deceased"
label values donor_type1 donor
order dob sex tx1 tx1_loc organ1 donor_type1 multi1 multi2 multi3 multi4 organ2 tx2_date organ3 tx3_date censor_date DeathDate, after(id)
*Data Cleaning *
generate dateCollected = date(DateCollected, "DMY")
format %tdCCYY/NN/DD dateCollected
codebook dateCollected
drop DateCollected
rename dateCollected DateCollected
order DateCollected TimeCollected, after(Test)
*dropping duplicates *
sort id DateCollected TimeCollected Result
quietly by id DateCollected TimeCollected Result: gen dup=cond(_N==1,0,_n)
drop if dup > 1
drop dup
*save *
save "\\files\Injury.dta"
I have got as far as this line in the code:
generate dateCollected = date(DateCollected, "DMY")
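However, it gives me a type mismatch error.
I think this is caused by the date formats differing between the creatinine file and the report file.
Please take a look and advise. Many thanks.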
Data
Creatinine.dta (only one result shown; there are multiple results per ID)
id       dob        sex     tx1        tx1_loc   organ1  donor_type1  censor_date  DeathDate  DateCollected  Test              Result  Units
2010003  15-Apr-07  Female  29-Jan-09  Hospital  Heart   Deceased     30-Jun-16               12/5/2007      Creatinine,blood  25      umol/L
Report.dta (only one ID shown)
id       dob        sex  tx1        tx1_loc  organ1  donor_type1  multi1  multi2  multi3  multi4  organ2  tx2_date  organ3  tx3_date  censor_date  DeathDate
2010003  15-Apr-07  2    29-Jan-09  1        1       2                                                                                30-Jun-16
Answer 0 (score: 3)
Note that the problem surfaces after the merge has been performed.
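One way to see what you are working with at that point is to check the storage type of DateCollected right before the failing line. The sketch below uses made-up stand-ins for the two files (the toy values and the tempfile name report are assumptions for illustration, not your real data):

```stata
clear
* Toy stand-in for Report.dta: one row per id (values are illustrative)
input long id byte sex
2010003 2
end
tempfile report
save `report'

clear
* Toy stand-in for Creatinine.dta: several rows per id
input long id str10 DateCollected
2010003 "12/05/2007"
2010003 "13/05/2007"
end

merge m:1 id using `report'

* Inspect the storage type and display format before calling date()
describe DateCollected
```

If describe reports a str# storage type, date() can be applied to the variable. If it reports a numeric type (for example int, long, float or double, usually with a %td display format), the variable is already a Stata date.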
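You get the r(109) error because you are trying to generate a new variable by applying the date() function to a numeric variable. This function requires a string (variable) as input.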
I am not sure why you want to do this, but if you merely want to create dateCollected and use it for further work, while keeping DateCollected as a backup, you can simply clone it:
clonevar dateCollected = DateCollected
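If the same do-file has to cope with DateCollected arriving either as a string or as a numeric date, a guard along the following lines may help. This is only a sketch on toy data; the single observation and the choice of confirm as the check are my assumptions, not something taken from your files:

```stata
clear
set obs 1
* Toy data: DateCollected held as a numeric daily date
generate long DateCollected = date("12/05/2007", "DMY")

* confirm sets _rc to 0 only when the variable is a string
capture confirm string variable DateCollected
if _rc == 0 {
    * still a string: convert it with date()
    generate long dateCollected = date(DateCollected, "DMY")
}
else {
    * already numeric: copy it, keeping type, format and labels
    clonevar dateCollected = DateCollected
}
format dateCollected %tdCCYY/NN/DD

* in this toy case the copy must equal the original
assert dateCollected == DateCollected
```

The date() call sits inside the branch that only runs for string input, so the type mismatch cannot be triggered when the variable is numeric.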
EDIT:
Elaborating on my comment:
. clear
. set obs 1
number of observations (_N) was 0, now 1
. generate DateCollected_String = "12/05/2007"
. generate DateCollected = date(DateCollected_String, "DMY")
. format %tdDD/NN/CCYY DateCollected
. browse
. generate dateCollected = date(DateCollected, "DMY")
type mismatch
r(109);
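Continuing from the same toy setup: because DateCollected is already a numeric daily date here, a second date() call is not needed. A minimal sketch, assuming all you want is the year/month/day display used in the question, is to change the display format only:

```stata
clear
set obs 1
generate DateCollected_String = "12/05/2007"
generate DateCollected = date(DateCollected_String, "DMY")

* Already numeric: change how it is displayed, not how it is stored
format DateCollected %tdCCYY/NN/DD

* The underlying value is untouched by the format change
assert DateCollected == date("12/05/2007", "DMY")
list DateCollected
```

The format command affects only how the value is shown in list, browse and similar output, so later comparisons against date() results keep working as before.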