I can't count how many times the variable Bucket (a character variable) takes a particular value.
data Bucket;
set Agreement8;
select;
when (0.0 <= ltv_max_on <= 0.25) Bucket="0-25";
when (0.26 <= ltv_max_on <= 0.50) Bucket="26-50";
when (0.51 <= ltv_max_on <= 0.75) Bucket="51-75";
when (0.76 <= ltv_max_on <= 0.100) Bucket="51-75";
otherwise Bucket=">100";
end;
run;
Then I run:
proc sql;
select count(*) as No_Obs_1
from Summary_Bucket
where Bucket="0-25";
;
quit;
proc sql;
select count(*) as No_Obs_2
from Summary_Bucket
where Bucket="26-50";
quit;
proc sql;
select count(*) as No_Obs_3
from Summary_Bucket
where Bucket="51-75";
;
quit;
proc sql;
select count(*) as No_Obs_4
from Summary_Bucket
where Bucket="76-100";
;
quit;
proc sql;
select count(*) as No_Obs_5
from Summary_Bucket
where Bucket=">100";
;
quit;
This is my result:
But I clearly have more than zero or one observations for each of these values:
Answer 0: (score: 2)
You can do the whole calculation in proc sql:
proc sql;
select (case when ltv_max_on <= 0.25 then '0-25'
when ltv_max_on <= 0.50 then '26-50'
when ltv_max_on <= 0.75 then '51-75'
when ltv_max_on <= 1.00 then '76-100' /* was '51-75', which merged two buckets */
else '>100'
end) as bucket,
count(*) as n_obs
from Agreement8
group by (case when ltv_max_on <= 0.25 then '0-25'
when ltv_max_on <= 0.50 then '26-50'
when ltv_max_on <= 0.75 then '51-75'
when ltv_max_on <= 1.00 then '76-100'
else '>100'
end);
quit; /* PROC SQL ends with QUIT, not RUN */
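There is no need for multiple steps. That is one of the advantages of SQL.
Answer 1: (score: 2)
A few things are worth checking:
1) Specify the length of Bucket before assigning values to it, because the table you showed displays only 4 characters. For example, try the following: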
data Bucket;
set Agreement8;
length Bucket $ 6; /* <- try adding this line; 6 is wide enough for "76-100" */
select;
when (0.0 <= ltv_max_on <= 0.25) Bucket="0-25";
when (0.25 < ltv_max_on <= 0.50) Bucket="26-50"; /* open lower bound closes the 0.25-0.26 gap */
when (0.50 < ltv_max_on <= 0.75) Bucket="51-75";
when (0.75 < ltv_max_on <= 1.00) Bucket="76-100"; /* was 0.100 and "51-75" */
otherwise Bucket=">100";
end;
run;
2) Also, you create a table named Bucket, but your SQL queries a table named Summary_Bucket, which looks inconsistent.
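To illustrate point 1, here is a minimal sketch of the truncation, under the assumption that no LENGTH statement precedes the first assignment. The data set trunc_demo, its three rows, and the column name n_obs are made up for this illustration and do not come from the question.

```sas
/* Hypothetical demo data; trunc_demo and n_obs are illustrative names only */
data trunc_demo;
input ltv_max_on;
/* No LENGTH statement: the first assignment below fixes Bucket at 4 characters */
if ltv_max_on <= 0.25 then Bucket = "0-25";
else Bucket = "26-50"; /* stored as "26-5" because Bucket is only 4 characters wide */
datalines;
0.10
0.40
0.45
;
run;

/* One grouped query instead of a separate PROC SQL step per value */
proc sql;
select Bucket, count(*) as n_obs
from trunc_demo
group by Bucket;
quit;
```

In this sketch the second group is stored as "26-5", so a filter such as where Bucket="26-50" would match no rows. Declaring the length before the first assignment, as in the code under point 1, avoids that.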
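Answer 2: (score: 1)
The problem is that your Bucket values are being truncated. I would simply use proc format, which keeps everything simple and easy to display: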
data have;
input n;
datalines;
0
10
25
35
45
55
75
85
95
106
;
run;
proc format;
value newval
0 - 25 = '0-25'
25 <- 50 = '26-50'
50 <- 75 = '51-75'
75 <- 100 = '76-100' /* was "76 - high", which labelled 76-100 as >100 */
100 <- high = '>100'
;
run;
proc sql;
select put(n,newval.) as ranges, count(*) as n_obs
from have
group by 1;
quit;
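The same idea might be applied directly to the question's data. The following is only a sketch, assuming ltv_max_on in Agreement8 is stored as a fraction (0.25 meaning 25%); the format name ltvfmt is hypothetical. It swaps in PROC FREQ for the counting, since PROC FREQ groups by formatted values, so no new character variable is created and nothing can be truncated.

```sas
/* Hypothetical format name; ranges assume ltv_max_on is a fraction */
proc format;
value ltvfmt
low - 0.25 = '0-25'
0.25 <- 0.50 = '26-50'
0.50 <- 0.75 = '51-75'
0.75 <- 1.00 = '76-100'
1.00 <- high = '>100'
;
run;

/* Counts are produced per formatted value of ltv_max_on */
proc freq data=Agreement8;
tables ltv_max_on / missing; /* MISSING lists missing values as their own row */
format ltv_max_on ltvfmt.;
run;
```

The `<-` operator excludes the lower endpoint of a range, so each boundary value falls into exactly one bucket.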