This question extends the excellent answer Robert Picard provided here: How to Randomly Assign to Groups of Different Sizes
We have the same dataset as in the previous question, but with a year variable added:
sysuse census, clear
keep state region pop
order state pop region
decode region, gen(reg)
replace reg="NCntrl" if reg=="N Cntrl"
drop region
gen year=20
replace year=30 if _n>15
replace year=40 if _n>35
If I only wanted to reshuffle reg across all observations (ignoring groups), I could apply the answer from the previous post:
tempfile orig
save `orig'
keep reg
rename reg reg_new
set seed 234
gen double u = runiform()
sort u reg_new
merge 1:1 _n using `orig', nogen
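How can the code be modified so that reg is shuffled, but only within year? For example, there are 15 observations with year==20. These observations should be shuffled separately from those in the other years.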
Answer 0 (score: 2)
Shuffling a variable doesn't need any file choreography. This could probably be shortened:
sysuse auto, clear
set seed 2803
gen double shuffle = runiform()
* example 1
sort shuffle
gen long which = _n
sort mpg
gen mpg_new = mpg[which]
list which mpg*
* example 2
bysort foreign (shuffle) : gen long which2 = _n
bysort foreign (mpg) : gen mpg2 = mpg[which2]
list which2 mpg mpg2, sepby(foreign)
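The second example carries over to the data in the question. What follows is only a sketch under that assumption: it rebuilds the census dataset with the year variable, then shuffles reg within each year. The names u, which and reg_new are illustrative choices, not part of the original answer.

```stata
* rebuild the dataset from the question
sysuse census, clear
keep state region pop
decode region, gen(reg)
replace reg = "NCntrl" if reg == "N Cntrl"
drop region
gen year = 20
replace year = 30 if _n > 15
replace year = 40 if _n > 35

* one random number per observation
set seed 234
gen double u = runiform()

* within each year, order by u and record the position: a random permutation of 1.._N
bysort year (u) : gen long which = _n

* within each year, order by reg and pick the value at the recorded position;
* under by:, the subscript is interpreted relative to the by-group
bysort year (reg) : gen reg_new = reg[which]

list year reg reg_new, sepby(year)
```

Because which is a permutation of the positions inside each year, reg_new holds the same set of reg values as before within that year, only in a different order.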
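All that said, I think sample will do this so long as you specify a sample size equal to the number of observations in the dataset. It's overkill, though, as you get all the variables.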