I have the following data in Stata:
clear
* Input data
input float id str7 event time
1 "." 10
1 "." 20
1 "1" 30
1 "0" 40
1 "." 50
2 "0" 10
2 "0" 20
2 "0" 30
2 "0" 40
2 "0" 50
3 "1" 10
3 "1" 20
3 "0" 30
3 "." 40
3 "." 50
4 "." 10
4 "." 20
4 "." 30
4 "." 40
4 "." 50
5 "1" 10
5 "1" 20
5 "1" 30
5 "1" 40
5 "1" 50
end
This is the data I would like to obtain:
* Desired output
input float id str7 event time
1 "1" 30
2 "0" 50
3 "1" 10
4 "." 50
5 "1" 10
end
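My goal is to keep, for each id, the first row in which event equals 1. If an id never has an event at any time, I want to keep the row with the last time reported for that id.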
Answer 0 (score: 3)
Here is another approach:
bysort id (time): egen when_first_1 = min(cond(event == "1", time, .))
by id: gen tokeep = cond(when_first_1 == ., time == time[_N], time == when_first_1)
keep if tokeep
drop tokeep
See especially Section 9 of this paper.
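To see what the egen line contributes on its own, here is a minimal sketch on a cut-down data set. The five rows are chosen only for illustration, and the sketch stops before the keep step so the intermediate variable can be inspected.

```stata
clear
* Illustrative rows only: id 1 has an event at time 30, id 2 never has one
input float id str7 event float time
1 "." 10
1 "1" 30
1 "0" 40
2 "0" 10
2 "0" 20
end

* cond() returns time on event rows and missing on all other rows;
* egen's min() ignores missings, so each id gets its earliest event time,
* or missing when the id has no event row at all
bysort id (time): egen when_first_1 = min(cond(event == "1", time, .))

list, sepby(id)
```

An id with no event ends up with when_first_1 missing, which is what the second branch of cond() in the tokeep line relies on to fall back to the last time.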
Answer 1 (score: 2)
The following works for me:
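* Recode "1" as "-1" so that event rows sort first within each id
replace event = "-1" if event == "1"
* event[_n==1] is event[1] on the first row of an id and event[0] (empty) elsewhere
bysort id (event time): generate tag1 = event[_n==1] == "-1"
* event[_n==_N] is event[1] on the last row of an id and event[0] (empty) elsewhere
bysort id (event time): generate tag2 = event[_n==_N] == "0"
bysort id (event time): generate tag3 = event[_n==_N] == "."
* Restore the original coding and keep the tagged rows
replace event = "1" if event == "-1"
keep if tag1 == 1 | tag2 == 1 | tag3 == 1
list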
     +----------------------------------------+
     | id   event   time   tag1   tag2   tag3 |
     |----------------------------------------|
  1. |  1       1     30      1      0      0 |
  2. |  2       0     50      0      1      0 |
  3. |  3       1     10      1      0      0 |
  4. |  4       .     50      0      0      1 |
  5. |  5       1     10      1      0      0 |
     +----------------------------------------+
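One caveat may be worth checking before relying on this approach. Because the rows are sorted on the string event, an id with no event that mixes "0" and "." rows is ordered by event first and by time only second. The sketch below uses a hypothetical id 6, which is not in the question's data, to probe that case. It assumes Stata's usual byte-order sorting of strings, where "." comes before "0".

```stata
clear
* Hypothetical id 6: no event, and the latest time (20) is a "." row
input float id str7 event float time
6 "0" 10
6 "." 20
end

replace event = "-1" if event == "1"
* Within id 6 the sort order is "." (time 20), then "0" (time 10)
bysort id (event time): generate tag1 = event[_n==1] == "-1"
bysort id (event time): generate tag2 = event[_n==_N] == "0"
bysort id (event time): generate tag3 = event[_n==_N] == "."
replace event = "1" if event == "-1"
keep if tag1 == 1 | tag2 == 1 | tag3 == 1

list
```

Under that assumption the row kept is the one at time 10 rather than the one at time 20, so the result matches the stated goal only when the last row by event order is also the last row by time, as it is in the question's data.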
Answer 2 (score: 1)
This is based on an answer by Romalpa Akzo on Statalist:
bys id (time): gen tag = 1 if event == "1" | _n ==_N
bys id (tag time): keep if _n == 1
drop tag
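I think this is the most concise answer so far. Note that tag is missing wherever it is not 1. Numeric missing values sort after every nonmissing value, so within each id the tagged rows come first and the earliest of them is the row kept.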
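The role of the missing values can be seen by listing the data after sorting on tag but before dropping anything. This is a minimal sketch on a cut-down data set chosen for illustration. It splits the second step into an explicit sort followed by by id: so that the intermediate ordering can be inspected.

```stata
clear
* Illustrative rows only: id 1 has an event at time 30, id 2 never has one
input float id str7 event float time
1 "." 10
1 "1" 30
1 "0" 40
1 "." 50
2 "0" 10
2 "0" 20
end

* Tag every event row and the last row of each id; all other rows stay missing
bysort id (time): gen tag = 1 if event == "1" | _n == _N

* Sorting on tag puts tagged rows (tag == 1) ahead of untagged rows (tag == .)
sort id tag time
list, sepby(id)

* The first row of each id is now the earliest tagged row
by id: keep if _n == 1
list
```

For id 1 the event row at time 30 sorts ahead of the tagged last row at time 50, so the event wins. For id 2 the only tagged row is the last one, at time 20.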