My data looks like this:
clear
input str5 name long id str3 place byte count str5 name_label byte(counting counting_ideal)
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Steve" 110821105 "ABC" 1 "Steve" 1 1
"Jeff" 110711108 "ABC" 0 "" 2 .
"Jeff" 110711114 "ABC" 0 "" 3 .
"Jeff" 110711104 "ABC" 0 "" 4 .
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Jess" 110712102 "ABC" 1 "Jess" 5 2
"Matt" 110712101 "ABC" 1 "Matt" 6 3
"Matt" 110712101 "ABC" 1 "Matt" 6 3
"Matt" 110712101 "ABC" 1 "Matt" 6 3
"Matt" 110712101 "ABC" 1 "Matt" 6 3
"Matt" 110712101 "ABC" 1 "Matt" 6 3
"Matt" 110712101 "ABC" 1 "Matt" 6 3
"Matt" 110712101 "ABC" 1 "Matt" 6 3
end
How can I generate the counting_ideal column? Note that place takes several values in my data, so the command I am after should specify place=="ABC".
I tried:
encode name if count>0 & place=="ABC", gen(counting)
...but this produces a consecutive count that does not skip the observations with an empty name_label.
Answer 0 (score: 0)
Given your (welcome) data example, this should do it:
gen wanted = cond(name_label == "", ., sum(name_label != name_label[_n-1] & name_label != ""))
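As a quick check, here is a minimal sketch that runs that one-liner on a cut-down version of the example data. The shortened dataset is only an illustration (it keeps name_label and counting_ideal and a few rows per name), and the assert is there just to confirm that the new variable matches the desired column.

```stata
clear
input str5 name_label byte counting_ideal
"Steve" 1
"Steve" 1
""      .
""      .
"Jess"  2
"Jess"  2
"Matt"  3
end

* running count of runs of identical non-empty labels;
* missing wherever the label is empty
gen wanted = cond(name_label == "", ., sum(name_label != name_label[_n-1] & name_label != ""))

* the two columns should agree, missing values included
assert wanted == counting_ideal

list, sepby(name_label)
```

Note that name_label[_n-1] in the first observation refers to observation 0, which Stata treats as missing (an empty string here), so the first non-empty label always starts run 1.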
Deconstructing the expression in pseudocode:
if name_label is empty {
return missing
}
else return the count of runs of name_labels so far
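A run here is just a sequence of identical non-missing values. So we identify the start of a new run by the condition that the name label differs from the previous name label (and is not empty):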
name_label != name_label[_n-1] & name_label != ""
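This is a true-or-false expression, which evaluates to 1 or 0. sum() yields a cumulative (running) sum, which goes up by 1 whenever it meets a 1 and stays unchanged whenever it meets a 0, and that is exactly what counting is.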
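If the one-liner feels dense, the same logic can be split into separate steps. This is only a sketch on a tiny made-up dataset; the variable names newrun and runcount are my own and are not part of the question.

```stata
clear
input str5 name_label
"Steve"
"Steve"
""
"Jess"
"Jess"
"Matt"
end

* 1 at the first observation of each run of non-empty labels, 0 elsewhere
gen byte newrun = name_label != name_label[_n-1] & name_label != ""

* the running sum steps up by 1 at each 1 and stays flat at each 0
gen runcount = sum(newrun)

* blank out the count where the label is empty
replace runcount = . if name_label == ""

list
```

Keeping the indicator as its own variable makes it easy to list the data and see exactly where each run starts before trusting the count.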
The principles behind handling such runs or spells are discussed in this paper.
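Your stipulation about the place variable isn't really explained, but if you want to number separately for distinct values of place, then this should do the whole job: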
gen long obs = _n
bysort place (obs): gen wanted = ...
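Spelled out with the earlier expression in place of the dots, the by-group version might look like the sketch below. The second place value "XYZ" and the names in it are invented purely for illustration, and I am assuming the data are already in the order in which runs should be counted.

```stata
clear
input str3 place str5 name_label
"ABC" "Steve"
"ABC" ""
"ABC" "Jess"
"XYZ" "Anna"
"XYZ" "Anna"
"XYZ" "Bob"
end

* record the current order so that bysort preserves it within each place
gen long obs = _n

* under by:, both [_n-1] and sum() work within each place,
* so the numbering restarts at 1 for every place
bysort place (obs): gen wanted = cond(name_label == "", ., sum(name_label != name_label[_n-1] & name_label != ""))

list, sepby(place)
```

Sorting on obs within place is what guarantees that the original order of observations, and hence the definition of each run, survives the sort by place.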