Question

I have a dataset:

Period  Brand   ID
  Jan    A      X1
  Jan    A      K1
  Jan    B      CT2
  Feb    C      X2
  Feb    A      P4

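I want a count of distinct IDs for each brand in each period. I tried using CASE WHEN inside proc sql to count the distinct IDs per period, but I'm not sure what to put in the ELSE branch, because I suspect SAS will also count the ELSE value as a distinct item. My code is below: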

 proc sql;

    create table items as
    select period,
    count(distinct case when brand="A" then ID else "." end) as Count_A,
    count(distinct case when brand="B" then ID else "." end) as Count_B,
    count(distinct case when brand="C" then ID else "." end) as Count_C
from Data
group by period;
quit;

I do understand that I could use a subquery to build each count variable, but the code would get long and tedious.

Thanks!

Answer 1

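I think your query will work as long as the ELSE branch returns NULL (a missing value in SAS), because COUNT(DISTINCT ...) ignores missing values. You can get this simply by leaving out the ELSE clause, in which case the CASE expression defaults to NULL.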

select
    period,
    count(distinct case when brand = "A" then ID end) as Count_A,
    count(distinct case when brand = "B" then ID end) as Count_B,
    count(distinct case when brand = "C" then ID end) as Count_C
from Data
group by period
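
To see the difference, here is a minimal sketch using the sample data from the question. The dataset name `have` and the column names `with_dot` and `without_else` are only illustrative. For a character column such as ID, `"."` is an ordinary one-character string rather than a missing value, so it is counted as one more distinct ID.

```sas
/* Sample data from the question */
data have;
    input Period $ Brand $ ID $;
    datalines;
Jan A X1
Jan A K1
Jan B CT2
Feb C X2
Feb A P4
;
run;

proc sql;
    select period,
           /* "." is a real character value here, so it is counted too */
           count(distinct case when brand = "B" then ID else "." end) as with_dot,
           /* no ELSE: non-matching rows become missing and are ignored */
           count(distinct case when brand = "B" then ID end) as without_else
    from have
    group by period;
quit;
```

With this data, `with_dot` should come out as 2 for Jan and 1 for Feb, while `without_else` gives the intended 1 and 0. If you would rather spell out the ELSE branch, `else " "` should behave the same way, since a blank string is the missing value for character variables in SAS.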

Answer 2

This should work for you:

/*Creating Dataset*/
data a;
input Period $3. Brand$1. Id$3.@;
datalines;
JanAX1
JanAK1
JanBCT2
FebCX2
FebAP4
JanA
FebB
JanAX1
FebCX2
;
run;

/* Adding a non-blank check on ID to each brand condition
   should do the job for you */
proc sql;
    create table items as
    select period,
        count(distinct case when (brand="A" and ID not=" ") then ID end) as Count_A,
        count(distinct case when (brand="B" and ID not=" ") then ID end) as Count_B,
        count(distinct case when (brand="C" and ID not=" ") then ID end) as Count_C
    from a
    group by period;
quit;

Hope this helps :-)

Answer 3

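If you don't want to hard-code each column, add Brand to the GROUP BY clause and then transpose the result. This is a useful approach when there are many different brands.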

data have;
input Period $ Brand $ ID $;
datalines;
Jan    A      X1
Jan    A      K1
Jan    B      CT2
Feb    C      X2
Feb    A      P4    
;
run;

proc sql;
create table temp
as select
    period,
    brand,
    count(distinct id) as count
from have
group by period, brand;
quit;

proc transpose data=temp out=items (drop=_NAME_) prefix=Count_;
by period;
id brand;
var count;
run;
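
One difference from the CASE WHEN approach is worth noting: a period/brand combination that never occurs in the data (for example Feb and brand B) has no row in `temp`, so after the transpose it shows up as a missing value rather than 0. If zeros are preferred, a follow-up data step along these lines could fill them in. This is only a sketch; the output name `items_filled` and the array name `cnt` are made up for illustration.

```sas
data items_filled;
    set items;
    /* Count_: picks up every variable whose name starts with Count_ */
    array cnt {*} Count_:;
    do i = 1 to dim(cnt);
        /* replace missing counts with 0 */
        if missing(cnt{i}) then cnt{i} = 0;
    end;
    drop i;
run;
```

Using the `Count_:` prefix list keeps this step free of hard-coded brand names, in line with the rest of this answer.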

Setting a NULL value for CASE in proc sql
