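Does anyone have a good solution for the following? I need to sum over the observations within each ID and take the value of "dist" at the point where the running sum of "total" reaches 1000: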
data DATA ;
input ID $ dist total ;
cards ;
A 1.5 600
A 2.5 500
A 3.0 200
B 2.8 1050
B 6.8 100
C 0.8 900
C 1.2 150
C 3.5 300
;
run;
Desired output with the third column being optional:
A 2.5 1100
B 2.8 1050
C 1.2 1050
Answer 0 (score: 2)
Use BY-group processing and retain a cumulative total and an output flag:
data want ;
  set have ;
  by ID ; /* assumes your data is already sorted by ID */
  retain cumtot _out . ;
  if first.ID then call missing(cumtot, _out) ; /* reset at the start of each ID */
  cumtot + total ;
  if cumtot >= 1000 and not _out then do ;
    _out = 1 ; /* set flag so we don't output further records for this ID */
    output ;
  end ;
  drop _: ;
run ;
I would also avoid naming a dataset "data".
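To try the step above, the question's sample could be read into a dataset named have and sorted by ID first. This is only a sketch: the name have is taken from the answer's set statement, and the PROC SORT step is an assumption for inputs that are not already ordered by ID (the sample data already is).

```sas
/* Read the sample rows into HAVE instead of a dataset called DATA */
data have ;
  input ID $ dist total ;
  cards ;
A 1.5 600
A 2.5 500
A 3.0 200
B 2.8 1050
B 6.8 100
C 0.8 900
C 1.2 150
C 3.5 300
;
run ;

/* BY-group processing requires the input to be ordered by ID.       */
/* PROC SORT keeps the original order of rows within each ID, which  */
/* matters here because the running total depends on row order.      */
proc sort data=have ;
  by ID ;
run ;
```

With this input, the answer's step keeps the rows A 2.5, B 2.8 and C 1.2, and cumtot holds 1100, 1050 and 1050, which matches the optional third column in the desired output.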