Say I have this MWE data:
data v;
input var1 $ var2 var3 $;
datalines;
cat 3 yes
sheep 2 no
sheep 3 maybe
pig 3 maybe
goat 3 maybe
cat 2 no
pig 1 no
cat 2 no
pig 1 no
goat 3 no
cat 3 no
cat 2 yes
cat 1 yes
sheep 3 no
cat 2 no
cat 1 maybe
;
run;
I use PROC TABULATE to count the number of observations for each value. I do this for each variable:
proc tabulate data=v;
class var1;
table (var1='' all="Total"),(N pctn);
run;
proc tabulate data=v;
class var2;
table (var2='' all="Total"),(N pctn);
run;
proc tabulate data=v;
class var3;
table (var3='' all="Total"),(N pctn);
run;
The output I get looks like this:
N PctN
cat 8 50.00
goat 2 12.50
pig 3 18.75
sheep 3 18.75
Total 16 100.00
N PctN
1 4 25.00
2 5 31.25
3 7 43.75
Total 16 100.00
N PctN
maybe 4 25.00
no 9 56.25
yes 3 18.75
Total 16 100.00
My question is: how can I export this to Excel in the following format?
Name Cat 1 N1 N1% Cat 2 N2 N2% Cat 3 N3 N3% Cat 4 N4 N4% Missing % Total Total%
var1 cat 8 50 goat 2 12.5 pig 3 18.75 sheep 3 18.75 0 16 100
var2 1 4 25 2 5 31.25 3 7 43.75 0 16 100
var3 maybe 4 25 no 9 56.25 yes 3 18.75 0 16 100
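In other words, I want each variable to have its own row. Each value of the variable appears in that row, together with its number of observations and its percentage of the total observations. The last columns are a bonus but not required: the number and percentage of missing observations, and the total for the variable. How can I do this?
Note that I am new to SAS. Any improvements to the code are also welcome, such as how to loop or condense the code that generates the tables.
Answer 0 (score: 1)
The desired layout is very awkward, and it becomes harder to work with as the number of variables and the number of their distinct values grows.
The following processing steps can be performed to achieve the output structure: transpose the data to a long form with one value per row, compute counts and percentages with PROC FREQ, build name/value pairs in a DATA step, and transpose those pairs to a wide form.
Example
The data has a fourth variable with some missing values.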
data have;
input var1 $ var2 var3 $ var4;
datalines;
cat 3 yes .
sheep 2 no .
sheep 3 maybe .
pig 3 maybe .
goat 3 maybe 1
cat 2 no 1
pig 1 no 1
cat 2 no 1
pig 1 no 1
goat 3 no 1
cat 3 no 1
cat 2 yes 1
cat 1 yes 1
sheep 3 no 1
cat 2 no 2
cat 1 maybe 1
;
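run;
/* Add a row identifier so each observation can be transposed on its own */
data have_v / view=have_v;
set have;
rowid + 1;
run;
/* Show numeric missing values as blanks when they are converted to character */
options missing = ' ';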
proc transpose data=have_v out=vector1(index=(_name_));
by rowid;
var var1 var2 var3 var4;
run;
/* Count each distinct value, including missing, within each original variable */
proc freq noprint data=vector1;
by _name_;
table col1 / missing out=freqs;
run;
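/* Restore the default display of numeric missing values */
options missing = '.';
/* Build name/value pairs: one row per future column of the wide table */
data freqs_0;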
set freqs;
by _name_;
retain nomiss;
if first._name_ then nomiss = not missing(col1);
if first._name_ then seq=1; else seq+1;
seqc = cats(seq);
if first._name_ and missing(col1) then do;
seqc = 'missing';
seq = 0;
end;
length widename $32;
if seqc ne 'missing' then do;
widename = cats("cat_",seqc);
widevalue = col1;
output;
end;
widename = cats("cat_",seqc,'_COUNT');
widevalue = COUNT;
output;
widename = cats("cat_",seqc,'_PERCENT');
widevalue = PERCENT;
output;
if last._name_ and nomiss then do;
seqc = 'missing';
widename = cats("cat_",seqc,'_COUNT');
widevalue = 0;
output;
widename = cats("cat_",seqc,'_PERCENT');
widevalue = 0;
output;
end;
keep _name_ widename widevalue;
run;
/* Pivot the name/value pairs into one row per original variable */
proc transpose data=freqs_0 out=wide;
by _name_;
id widename;
var widevalue;
run;
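The steps above stop at the wide data set and do not write the Excel file the question asks for. One possible way to do that, as a minimal sketch, is to print the data set inside an ODS EXCEL block. This assumes SAS 9.4 with the ODS EXCEL destination available; the file name wide_freqs.xlsx and the sheet name are placeholders chosen for illustration, not part of the original answer.

```sas
/* Sketch only: the output path and sheet name are hypothetical placeholders */
ods excel file="wide_freqs.xlsx" options(sheet_name="Frequencies");

/* Write the wide table, one row per original variable, without the Obs column */
proc print data=wide noobs;
run;

/* Close the destination so the workbook is actually written */
ods excel close;
```

If SAS/ACCESS Interface to PC Files is licensed, PROC EXPORT with DBMS=XLSX is another option. It writes the data set without the report styling that PROC PRINT adds.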