Identify the first event or the last non-event

Time: 2018-11-07 00:51:00

Tags: stata

I have the following data in Stata:

clear

* Input data
input float id str7 event time
1   "." 10
1   "." 20
1   "1" 30
1   "0" 40
1   "." 50
2   "0" 10
2   "0" 20
2   "0" 30
2   "0" 40
2   "0" 50
3   "1" 10
3   "1" 20
3   "0" 30
3   "." 40
3   "." 50
4   "." 10
4   "." 20
4   "." 30
4   "." 40
4   "." 50
5   "1" 10
5   "1" 20
5   "1" 30
5   "1" 40
5   "1" 50
end

This is the data I would like to obtain:

* Desired result
input float id1 str7 event1 time1
1   1   30
2   0   50
3   1   10
4   .   50
5   1   10
end

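My goal is to keep, within each id, the first row where event equals 1. If an id has no event at any time, I would like to report its last recorded time instead.

3 answers:

Answer 0: (score: 3)

Here is another approach: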

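* Earliest time with event == "1" within each id (missing if the id never has an event)
bysort id (time): egen when_first_1 = min(cond(event == "1", time, .))
* Keep the first event row if there is one, otherwise the last row of the id
by id: gen tokeep = cond(when_first_1 == ., time == time[_N], time == when_first_1)
keep if tokeep
drop tokeep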

See especially Section 9 of this paper.

Answer 1: (score: 2)

The following works for me:

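* Recode "1" as "-1" so that, as strings, events sort before "." and "0"
replace event = "-1" if event == "1"

* [_n==1] and [_n==_N] give subscript 1 on the first/last row of each id and 0 elsewhere;
* event[0] is empty, so each tag tests event[1] and can be 1 only on that first/last row
bysort id (event time): generate tag1 = event[_n==1] == "-1"
bysort id (event time): generate tag2 = event[_n==_N] == "0"
bysort id (event time): generate tag3 = event[_n==_N] == "."

* Restore the original coding and keep the tagged rows
replace event = "1" if event == "-1"
keep if tag1 == 1 | tag2 == 1 | tag3 == 1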

list

     +----------------------------------------+
     | id   event   time   tag1   tag2   tag3 |
     |----------------------------------------|
  1. |  1       1     30      1      0      0 |
  2. |  2       0     50      0      1      0 |
  3. |  3       1     10      1      0      0 |
  4. |  4       .     50      0      0      1 |
  5. |  5       1     10      1      0      0 |
     +----------------------------------------+

Answer 2: (score: 1)

This is based on Romalpa Akzo's answer on Statalist:

bys id (time): gen tag = 1 if  event == "1" | _n ==_N
bys id (tag time): keep if _n == 1
drop tag

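I think this is the most concise answer so far. Note that tag is missing whenever it is not 1; because missing values sort after 1, the tagged rows come first within each id in the second sort.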
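
To see the sort behavior in isolation, here is a minimal sketch that runs the same two bysort steps on a reduced version of the question's data. Only ids 1, 2 and 4 are included, and that subset is my choice for illustration; the variable names and commands are those used above.

```stata
clear
input float id str7 event float time
1 "." 10
1 "." 20
1 "1" 30
1 "0" 40
1 "." 50
2 "0" 10
2 "0" 20
2 "0" 30
2 "0" 40
2 "0" 50
4 "." 10
4 "." 20
4 "." 30
4 "." 40
4 "." 50
end

* tag is 1 on every event row and on the last row of each id; missing elsewhere
bysort id (time): generate tag = 1 if event == "1" | _n == _N

* Missing sorts after 1, so tagged rows come first within id, earliest time first
bysort id (tag time): keep if _n == 1
drop tag

list, noobs
```

For id 1 both the event row (time 30) and the last row (time 50) are tagged, and the earlier one is kept. For ids 2 and 4 only the last row is tagged, so the row at time 50 survives.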