Randomly shuffle a variable within groups

Time: 2018-02-20 14:29:59

Tags: random dataset stata assign

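This question extends the excellent answer Robert Picard provided here: How to Randomly Assign to Groups of Different Sizes

We have this dataset, the same as in the previous question but with a year variable added: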

sysuse census, clear
keep state region pop
order state pop region
decode region, gen(reg)
replace reg="NCntrl" if reg=="N Cntrl"
drop region
gen year=20 
replace year=30 if _n>15
replace year=40 if _n>35

If I just wanted to randomly reassign reg across all observations (ignoring groups), I could apply the answer from the previous post:

tempfile orig
save `orig'
keep reg
rename reg reg_new
set seed 234
gen double u = runiform()
sort u reg_new
merge 1:1 _n  using `orig', nogen

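How can I modify the code so that reg is shuffled, but only within year? For example, there are 15 observations with year==20. Those observations should be shuffled separately from the other years.

1 Answer:

Answer 0: (score: 2)

Shuffling a variable does not require any file choreography. This could probably be shortened: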

sysuse auto, clear 
set seed 2803 

gen double shuffle = runiform() 

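* example 1: shuffle mpg across all observations
* which = position of each observation in the random order
sort shuffle 
gen long which = _n 
* sort by mpg, then pick the value found at the random position
sort mpg 
gen mpg_new = mpg[which] 
list which mpg* 

* example 2: shuffle mpg separately within each level of foreign
* under by:, _n and subscripts count within each group
bysort foreign (shuffle) : gen long which2 = _n 
bysort foreign (mpg) : gen mpg2 = mpg[which2] 
list which2 mpg mpg2, sepby(foreign) 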
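
Applied to the dataset in the question, the pattern of example 2 might look like the sketch below. It is only an illustration, not code from the original answer: the names u, which and reg_new are taken from the question or chosen here, and the seed is arbitrary. The two tabulations at the end are a quick check that the counts of reg within each year are unchanged by the shuffle.

```stata
* rebuild the question's dataset
sysuse census, clear
keep state region pop
order state pop region
decode region, gen(reg)
replace reg = "NCntrl" if reg == "N Cntrl"
drop region
gen year = 20
replace year = 30 if _n > 15
replace year = 40 if _n > 35

* random key used only to define a random order (seed is arbitrary)
set seed 234
gen double u = runiform()

* position of each observation in the random order, within year
bysort year (u) : gen long which = _n

* sort by reg within year, then pick the value at the random position;
* under by:, the subscript [which] counts within each year
bysort year (reg) : gen reg_new = reg[which]

* check: the distribution of reg within year should be identical
tabulate year reg
tabulate year reg_new
```

Because which is a permutation of 1 to _N within each year, reg_new holds the same values as reg within each year, only in a different order.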

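All that said, I think sample would do this as long as you specify a sample size equal to the number of observations in the dataset. It is overkill, though, because you get all the variables.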