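I am trying to write a loop that generates and fills in a dummy variable indicating whether a person was a member of a specific party in a given year. My data are long, with each observation being a person-year. They look as follows.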
X1 X2 X3
AR, 1972-1981 PDC, 1982-1986 PFL, 1986-.
MD, 1966-1980 PMDB, 1980-1988 PSB, 1988-.
MD, 1966-1968 AR, 1968-1980 PDS, 1980-1985
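The party comes before the comma, and the years during which the person was a member of that party come after it. Any help would be much appreciated!
My code so far is: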
rename X1 XA
rename X2 XB
rename X3 XC
foreach var of varlist XA XB XC{
split `var', parse (,)
}
tabulate XA1, gen(p)
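Answer 0 (score: 2)
Here is one way to do it. I had to make an assumption about what the missing end year in X3 corresponds to, so you will need to change that.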
/* Enter Data */
clear
input str20 X1 str20 X2 str20 X3
"AR, 1972-1981" "PDC, 1982-1986" "PFL, 1986-."
"MD, 1966-1980" "PMDB, 1980-1988" "PSB, 1988-."
"MD, 1966-1968" "AR, 1968-1980" "PDS, 1980-1985"
end
compress
/* Split X1,X2,X3 into party, start year and end year and create 3 ID variables that we need later */
forvalues v=1/3 {
split X`v', parse(", " "-")
gen id`v'=_n
}
/* Make years numeric, and get rid of messy original data */
destring X12 X13 X22 X23 X32 X33, replace
replace X33 = 1990 if missing(X33) // enter your survey year here
drop X1 X2 X3
/* stack the spells on top of each other */
stack (id1 X11 X12 X13) (id2 X21 X22 X23) (id3 X31 X32 X33), into(id party year1 year2) clear
drop _stack
/* Put the data into long format and fill in the gaps */
reshape long year, i(id party) j(p)
drop p
/* need this b/c people can be in more than one party in a given year */
egen idparty = group(id party), label
xtset idparty year
tsfill
carryforward id party, replace // user-written command: ssc install carryforward
drop idparty
/* create party dummies */
tab party, gen(DD_)
/* rename the dummies to have party affiliation at the end instead of numbers */
foreach var of varlist DD_* {
levelsof party if `var'==1, local(party) clean
rename `var' ind_`party'
}
drop party
/* get back down to one person-year observation */
collapse (max) ind_*, by(id year)
list id year ind_*, sepby(id) noobs
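Note that carryforward is a user-written command from SSC, not part of official Stata. If installing it is not an option, the fill step can probably be done with built-in commands instead. The following is a minimal sketch on a made-up two-row spell (the values are illustrative only, not taken from the full pipeline above), assuming the rows added by tsfill have a missing id and an empty party:

```stata
* Toy panel: one id-party spell with a gap between 1972 and 1975
clear
input idparty year id str4 party
1 1972 1 "AR"
1 1975 1 "AR"
end
xtset idparty year
tsfill                      // adds rows for 1973 and 1974 with missing id and party

* Built-in replacement for "carryforward id party, replace":
* copy the previous row's value down within each spell
bysort idparty (year): replace id = id[_n-1] if missing(id)
bysort idparty (year): replace party = party[_n-1] if missing(party)
list, sepby(idparty)
```

Because replace works through the observations in order, each filled row is available to the row after it, so a gap of any length is filled from the first year of the spell.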
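Answer 1 (score: 1)
Following Dimitriy's lead (and explanation), here is a slightly different way. I make a different assumption about the missing endpoint: I truncate the series at the last known year.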
clear
set more off
input ///
str15 (XA XB XC)
"AR, 1972-1981" "PDC, 1982-1986" "PFL, 1986-."
"MD, 1966-1980" "PMDB, 1980-1988" "PSB, 1988-."
"MD, 1966-1968" "AR, 1968-1980" "PDS, 1980-1985"
end
list
*----- what you want? -----
// main
stack X*, into(X) clear
bysort _stack: gen id = _n
order id, first
split X, parse (, -)
rename (X1 X2 X3) (party sdate edate)
destring ?date, replace
gen diff = edate - sdate + 1
expand diff
bysort id party: replace sdate = sdate[1] + _n - 1
drop _stack X edate diff
// create indicator variables
tabulate party, gen(y)
// fix years with two or more parties
levelsof party, local(lp) clean
collapse (sum) y*, by(id sdate)
// rename
unab ly: y*
rename (`ly') (`lp')
list, sepby(id)
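The truncation at the last known year comes from how expand handles a missing count: a missing (or smaller than 1) value is treated as 1, so the observation is kept but not duplicated. A minimal sketch on two made-up spells, one of them open-ended, shows the effect in isolation:

```stata
* Two toy spells: PFL has no known end year, PDS runs 1980-1985
clear
input str4 party sdate edate
"PFL" 1986 .
"PDS" 1980 1985
end
gen diff = edate - sdate + 1    // missing for the open-ended PFL spell
expand diff                     // missing count is treated as 1: PFL stays a single row
bysort party: replace sdate = sdate[1] + _n - 1
list party sdate, sepby(party)
```

PFL should appear only in 1986, while PDS is expanded to one row per year from 1980 through 1985. To use a different assumption, fill in edate before computing diff, for example with replace edate = 1990 if missing(edate), where 1990 is only a placeholder for whatever end year suits your data.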