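Problem counting the values of a variable

Date: 2018-08-30 11:32:21

Tags: sql sas

I cannot count how many times the variable Bucket (a character variable) takes a particular value.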

data Bucket;
set Agreement8;
select;
    when (0.0 <= ltv_max_on <= 0.25) Bucket="0-25";
    when (0.26 <= ltv_max_on <= 0.50) Bucket="26-50";
    when (0.51 <= ltv_max_on <= 0.75) Bucket="51-75";
    when (0.76 <= ltv_max_on <= 0.100) Bucket="51-75";
    otherwise Bucket=">100";
end;
run;

Then I run:

proc sql;
select count(*) as No_Obs_1
from Summary_Bucket
where Bucket="0-25";
;
quit;

proc sql;
select count(*) as No_Obs_2
from Summary_Bucket
where Bucket="26-50";
quit;

proc sql;
select count(*) as No_Obs_3
from Summary_Bucket
where Bucket="51-75";
;
quit;

proc sql;
select count(*) as No_Obs_4
from Summary_Bucket
where Bucket="76-100";
;
quit;

proc sql;
select count(*) as No_Obs_5
from Summary_Bucket
where Bucket=">100";
;
quit;

These are my results:

[screenshot not available]

But I clearly have more than zero or one observation for each of these values:

[screenshot not available]

3 answers:

Answer 0 (score: 2)

You can do the entire calculation in proc sql:

proc sql;
    select (case when ltv_max_on <= 0.25 then '0-25'
                 when ltv_max_on <= 0.50 then '26-50'
                 when ltv_max_on <= 0.75 then '51-75'
                 when ltv_max_on <= 1.00 then '76-100'
                 else '>100'
            end) as bucket,
           count(*) as No_Obs
    from Agreement8
    /* the GROUP BY expression must match the SELECT expression exactly */
    group by (case when ltv_max_on <= 0.25 then '0-25'
                   when ltv_max_on <= 0.50 then '26-50'
                   when ltv_max_on <= 0.75 then '51-75'
                   when ltv_max_on <= 1.00 then '76-100'
                   else '>100'
              end);

quit;

There is no need for multiple steps. That is one of the advantages of SQL.

Answer 1 (score: 2)

A few things are worth checking:

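1) Declare the length of Bucket before assigning any value. SAS takes the length of a character variable from its first assignment, here "0-25" (4 characters), so longer labels are truncated ("26-50" is stored as "26-5") and your WHERE clauses never match them. That fits the table you showed, which displays only 4 characters. For example, try the following:

data Bucket;
set Agreement8;
length Bucket $ 6; /* <- try adding this line; "76-100" needs 6 characters */
select;
    when (0.0 <= ltv_max_on <= 0.25) Bucket="0-25";
    when (0.25 < ltv_max_on <= 0.50) Bucket="26-50"; /* strict lower bounds leave no gaps, e.g. 0.255 */
    when (0.50 < ltv_max_on <= 0.75) Bucket="51-75";
    when (0.75 < ltv_max_on <= 1.00) Bucket="76-100"; /* was 0.100 and "51-75" */
    otherwise Bucket=">100";
end;
run;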

2) Also, you created a table named Bucket, but your SQL queries a table named Summary_Bucket, which looks inconsistent.
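
To see the truncation from point 1 for yourself, here is a minimal sketch. The data sets demo, truncated and fixed and the five input values are made up for illustration; demo merely stands in for Agreement8.

/* Hypothetical toy data standing in for Agreement8 */
data demo;
input ltv_max_on;
datalines;
0.10
0.40
0.60
0.90
1.20
;
run;

/* No LENGTH statement: Bucket takes length 4 from the first assignment, "0-25" */
data truncated;
set demo;
if ltv_max_on <= 0.25 then Bucket="0-25";
else if ltv_max_on <= 0.50 then Bucket="26-50";
else if ltv_max_on <= 0.75 then Bucket="51-75";
else if ltv_max_on <= 1.00 then Bucket="76-100";
else Bucket=">100";
run;

/* Same logic with the length declared first, so no label is cut off */
data fixed;
length Bucket $ 6;
set demo;
if ltv_max_on <= 0.25 then Bucket="0-25";
else if ltv_max_on <= 0.50 then Bucket="26-50";
else if ltv_max_on <= 0.75 then Bucket="51-75";
else if ltv_max_on <= 1.00 then Bucket="76-100";
else Bucket=">100";
run;

/* Compare the stored values of Bucket in both data sets */
proc freq data=truncated;
tables Bucket;
run;

proc freq data=fixed;
tables Bucket;
run;

In truncated, every label longer than 4 characters should appear cut off (for example "26-5" instead of "26-50"), which is why a condition such as where Bucket="26-50" matches nothing. In fixed, the full labels should survive.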

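Answer 2 (score: 1)

The problem is that your values are being truncated in Bucket. I would simply use proc format, which keeps everything simple and easy to display: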

data have;
input n;
datalines;
0
10
25
35
45
55
75
85
95
106
;


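proc format;
value newval
0 - 25 = '0-25'
25 <- 50 = '26-50'   /* "<-" excludes the lower bound, so no value falls between ranges */
50 <- 75 = '51-75'
75 <- 100 = '76-100'
100 <- high = '>100'
;
run;

proc sql;
select put(n,newval.) as ranges, count(*) as No_Obs
from have
group by 1;
quit;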
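
If you would rather skip SQL altogether, the same format can be attached directly in proc freq, which groups observations by their formatted values. This is only a sketch and assumes the have data set and the newval format defined above.

proc freq data=have;
format n newval.;             /* count by formatted value instead of raw number */
tables n / nocum nopercent;   /* show frequencies only */
run;

Because the format only changes how n is displayed and grouped, no new character variable is created, so the truncation problem cannot occur.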