I have a dataset:
Period Brand ID
Jan A X1
Jan A K1
Jan B CT2
Feb C X2
Feb A P4
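I want a distinct count of IDs for each brand in each period. I tried using CASE WHEN inside proc sql to count the distinct values in each period, but I am not sure what to put in the ELSE part, because I suspect SAS also counts the ELSE value as a distinct item. My code is as follows: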
proc sql;
create table items as
select period,
count(distinct case when brand="A" then ID else "." end) as Count_A,
count(distinct case when brand="B" then ID else "." end) as Count_B,
count(distinct case when brand="C" then ID else "." end) as Count_C
from Data
group by period;
quit;
I do understand that I could build each count variable with a subquery, but the code would become long and tedious.
Thanks!
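Answer 0 (score: 1)
I think your query will work as long as the ELSE branch returns NULL, because COUNT(DISTINCT ...) ignores NULL (missing) values, whereas the literal "." is an ordinary string and is counted as one more distinct value. You can get this behavior by simply not listing any ELSE branch, in which case NULL is used by default.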
proc sql;
create table items as
select period,
count(distinct case when brand = "A" then ID end) as Count_A,
count(distinct case when brand = "B" then ID end) as Count_B,
count(distinct case when brand = "C" then ID end) as Count_C
from Data
group by period;
quit;
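To make the difference visible, here is a minimal sketch that runs both variants side by side on the sample data from the question. The dataset name `have` and the column names `A_with_else` and `A_no_else` are assumptions chosen for illustration; only brand A is shown to keep it short.

```sas
/* Sample data from the question; the dataset name "have" is an assumption */
data have;
input Period $ Brand $ ID $;
datalines;
Jan A X1
Jan A K1
Jan B CT2
Feb C X2
Feb A P4
;
run;

proc sql;
select period,
/* rows of other brands all become ".", which counts as one extra distinct value */
count(distinct case when brand = "A" then ID else "." end) as A_with_else,
/* rows of other brands become missing, and COUNT ignores missing values */
count(distinct case when brand = "A" then ID end) as A_no_else
from have
group by period;
quit;
```

With this data, Jan should show 3 for `A_with_else` (X1, K1 and ".") but 2 for `A_no_else`, and Feb should show 2 versus 1. SAS may write a note to the log that the CASE expression has no ELSE clause; that note is informational.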
Answer 1 (score: 0)
This should work for you:
/*Creating Dataset*/
data a;
input Period $3. Brand$1. Id$3.@;
datalines;
JanAX1
JanAK1
JanBCT2
FebCX2
FebAP4
JanA
FebB
JanAX1
FebCX2
;
run;
/*Count distinct IDs per brand, explicitly excluding blank IDs.
COUNT(DISTINCT ...) already skips missing values, so the blank check is
a safeguard rather than a requirement*/
proc sql;
create table items as
select period,
count(distinct case when (brand="A" and ID not=" ") then ID end) as Count_A,
count(distinct case when (brand="B" and ID not=" ") then ID end) as Count_B,
count(distinct case when (brand="C" and ID not=" ") then ID end) as Count_C
from a
group by period;
quit;
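Hope this helps :-)
Answer 2 (score: 0)
If you don't want to hard-code each column, add Brand to the group by clause and then transpose the result. This is a useful approach when there are many different brands.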
data have;
input Period $ Brand $ ID $;
datalines;
Jan A X1
Jan A K1
Jan B CT2
Feb C X2
Feb A P4
;
run;
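proc sql;
create table temp
as select
period,
brand,
count(distinct id) as count
from have
group by period, brand
/* make the sort order required by the BY statement in proc transpose explicit */
order by period, brand;
quit;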
proc transpose data=temp out=items (drop=_NAME_) prefix=Count_;
by period;
id brand;
var count;
run;
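One side effect of the transpose approach is that a brand with no rows in a given period ends up as a missing value rather than 0 (for example, Count_C for Jan in the sample data). If zeros are preferred, a data step along these lines could fill them in. This is a sketch, not part of the original answer: the output name `items_filled` and the loop variable `i` are assumptions, and it relies on every count column sharing the `Count_` prefix set in proc transpose.

```sas
data items_filled;
set items;
/* collect every variable whose name starts with Count_ */
array counts {*} Count_:;
do i = 1 to dim(counts);
/* a missing count means the brand did not occur in this period */
if missing(counts{i}) then counts{i} = 0;
end;
drop i;
run;
```

The CASE WHEN versions in the other answers return 0 directly in this situation, since COUNT over only missing values is 0.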