Counting groups in Stata while skipping empty values

Time: 2017-02-28 22:59:09

Tags: stata

My data look like this:

clear
input str5 name long id str3 place byte count str5 name_label byte(counting counting_ideal)
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Jeff"  110711108 "ABC" 0 ""      2 .
"Jeff"  110711114 "ABC" 0 ""      3 .
"Jeff"  110711104 "ABC" 0 ""      4 .
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Jess"  110712102 "ABC" 1 "Jess"  5 2
"Matt"  110712101 "ABC" 1 "Matt"  6 3
"Matt"  110712101 "ABC" 1 "Matt"  6 3
"Matt"  110712101 "ABC" 1 "Matt"  6 3
"Matt"  110712101 "ABC" 1 "Matt"  6 3
"Matt"  110712101 "ABC" 1 "Matt"  6 3
"Matt"  110712101 "ABC" 1 "Matt"  6 3
"Matt"  110712101 "ABC" 1 "Matt"  6 3
end

How can I generate the counting_ideal column? Note that I have several place variables, so the command I am after should specify place=="ABC".

I tried:

encode name if count>0 & place=="ABC", gen(counting) 

...but this produces a consecutive count that does not skip the observations with an empty name_label.

1 answer:

Answer 0: (score: 0)

Given your (welcome) data example:

gen wanted = cond(name_label == "", ., sum(name_label != name_label[_n-1] & name_label != "")) 

Deconstructed in pseudocode:

if name_label is empty { 
   return missing 
} 
else return the count of runs of name_labels so far 

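A run here is just a sequence of identical non-missing values. So we identify the start of a new run by the condition that the name label differs from the previous one (and is not empty):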

name_label != name_label[_n-1] & name_label != "" 

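This is a true-or-false expression that evaluates to 1 or 0. The cumulative sum produced by sum() increases by 1 each time it meets a 1 and stays the same each time it meets a 0, which is exactly what counting is.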
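To make the two steps visible, here is a minimal sketch that splits the one-liner into its parts on a small toy dataset. The variable names newrun and runcount and the shortened data are assumptions for illustration only; they are not in the original post.

```stata
clear
input str5 name_label
"Steve"
"Steve"
""
""
"Jess"
"Jess"
"Matt"
end

* 1 where a new run of non-empty labels starts, 0 otherwise
* (name_label[_n-1] is "" for the first observation)
gen byte newrun = name_label != name_label[_n-1] & name_label != ""

* running total of run starts seen so far
gen long runcount = sum(newrun)

* blank out the count where the label is empty
gen long wanted = cond(name_label == "", ., runcount)

* sanity checks on the toy data
assert wanted == 1 in 1/2
assert missing(wanted) in 3/4
assert wanted == 2 in 5/6
assert wanted == 3 in 7

list, sepby(name_label)
```

Note that this counts runs, not distinct names: if the same label reappeared later after a different or empty label, it would start a new run and get a new number.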

The principles behind handling such runs or spells are discussed in this paper.

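Your stipulation about the place variable is not really explained, but if you want separate numbering for each distinct value of place, then the whole operation should be done as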

gen long obs = _n 
bysort place (obs): gen wanted = ...
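As a sketch of how this might look in full, assuming the `...` stands for the same expression given earlier in the answer, the following toy example numbers runs separately within each value of place. The second place value "XYZ", the name "Anna" and the shortened data are hypothetical and only serve the illustration.

```stata
clear
input str3 place str5 name_label
"ABC" "Steve"
"ABC" "Steve"
"ABC" ""
"ABC" "Jess"
"XYZ" "Matt"
"XYZ" ""
"XYZ" "Anna"
end

* remember the original order so that sorting by place does not scramble it
gen long obs = _n

* under by:, _n and sum() restart within each value of place
bysort place (obs): gen wanted = cond(name_label == "", ., sum(name_label != name_label[_n-1] & name_label != ""))

* sanity checks: counting restarts at 1 for "XYZ"
assert wanted == 1 if inlist(obs, 1, 2, 5)
assert wanted == 2 if inlist(obs, 4, 7)
assert missing(wanted) if inlist(obs, 3, 6)

list, sepby(place)
```

Sorting on obs within place keeps the observations of each place in their original order, which matters because the run logic depends on which observation comes immediately before.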