Selecting the first record with a value

Time: 2014-02-05 14:13:35

Tags: stata

nparasit    numserial   ndate_added
    .       42231       05-Jun-00
    7992    42231       03-Jun-00
    .       422420       4-Jun-00
    144000  42242       05-Jun-00
    712800  42242       04-Jun-00
    NEG     42242       08-Jun-00
    371200  42242       06-Jun-00
    10138   42242       07-Jun-00
    .       110224      21-Dec-11
    0       110224      12-Dec-11

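I am using Stata 12 on Windows 7. I have a dataset in which numserial values are repeated. For each numserial, I want to select the first nparasit that has a value, ordered by the date added (ndate_added). I tried using bysort but without success. How should I go about this?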

1 answer:

Answer 0: (score: 1)

// prepare some example data
// (the NEG entry from the question is entered as missing, since nparasit is read as numeric)
clear
input ///
nparasit    numserial   str9 ndate_added
    .       42231       02-Jun-00
    7992    42231       03-Jun-00
    .       422420      04-Jun-00
    144000  42242       05-Jun-00
    712800  42242       04-Jun-00
    .       42242       08-Jun-00
    371200  42242       06-Jun-00
    10138   42242       07-Jun-00
    .       110224      21-Dec-11
    0       110224      12-Dec-11
end

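// convert the string date to a Stata daily date;
// the 20Y in the mask reads two-digit years as 20xx
gen date = date(ndate_added, "DM20Y")
format date %td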

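// mark the first non-missing value of nparasit within each numserial:
// grouping on miss as well keeps missing and non-missing rows apart, so
// _n == 1 is the earliest date in each group, and miss == 0 discards
// the groups that hold only missing values
gen byte miss = missing(nparasit)
bysort miss numserial (date) : ///
   gen byte mark = ( _n == 1) & ( miss == 0 )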

// inspect the result: mark == 1 flags the selected row in each numserial
sort numserial date
list, sepby(numserial)
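
The code above only flags the selected rows. If the aim is to reduce the dataset to one row per numserial, a minimal sketch along the following lines might work. It uses a subset of the example data above and assumes that rows without a value can be discarded outright, which the original answer does not do.

```stata
// Sketch under stated assumptions: keep only the earliest dated row
// with a non-missing nparasit for each numserial.
clear
input nparasit numserial str9 ndate_added
.       42231   02-Jun-00
7992    42231   03-Jun-00
144000  42242   05-Jun-00
712800  42242   04-Jun-00
.       110224  21-Dec-11
0       110224  12-Dec-11
end

// same date conversion as in the answer above
gen date = date(ndate_added, "DM20Y")
format date %td

// discard rows without a value, then keep the earliest remaining row
// within each numserial
drop if missing(nparasit)
bysort numserial (date) : keep if _n == 1

list, noobs
```

Unlike the mark variable, this drops any numserial whose nparasit values are all missing, so the flagging approach is the safer choice when those cases still need to be visible.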