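I have a table with four variables, and I want a table that contains every combination of their values. Only part of the table is shown here.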
NAME AMOUNT COUNT
RAJ 90 1
RAVI 20 4
JOHN 30 5
JOSEPH 40 3
The output below shows the values for RAJ only; the full output should cover all names.
NAME AMOUNT COUNT
RAJ 90 1
RAJ 90 4
RAJ 90 5
RAJ 90 3
RAJ 20 1
RAJ 20 4
RAJ 20 5
RAJ 20 3
RAJ 30 1
RAJ 30 4
RAJ 30 5
RAJ 30 3
RAJ 40 1
RAJ 40 4
RAJ 40 5
RAJ 40 3
.
.
.
.
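Answer 0: (score: 3)
SAS has a couple of useful options for this; both create a table containing every possible combination of the variables, after which you can drop the summary data you don't need. Given your initial dataset: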
data have;
input NAME $ AMOUNT COUNT;
datalines;
RAJ 90 1
RAVI 20 4
JOHN 30 5
JOSEPH 40 3
;;;;
run;
PROC FREQ has the SPARSE option.
proc freq data=have noprint;
tables name*amount*count/sparse out=want(drop=percent);
run;
There is also PROC TABULATE.
proc tabulate data=have out=want(keep=name amount count);
class name amount count;
tables name*amount,count /printmiss;
run;
This has the advantage of not conflicting with the name of the COUNT variable.
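If PROC FREQ is preferred, one way to sidestep the clash between the data's own COUNT variable and the COUNT frequency column that OUT= writes is to rename the variable on the way in and restore the name on the way out. This is a minimal sketch and not part of the original answer; the temporary name cnt is made up for illustration.

```sas
/* Sketch: cnt is a hypothetical temporary name, used only to avoid the name clash. */
proc freq data=have(rename=(count=cnt)) noprint;
  /* SPARSE writes every name*amount*cnt combination, including zero-frequency ones. */
  /* On the output data set, DROP= is applied before RENAME=, so the frequency      */
  /* COUNT and PERCENT columns are removed first and cnt then becomes COUNT again.  */
  tables name*amount*cnt / sparse
         out=want(drop=count percent rename=(cnt=count));
run;
```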
Answer 1: (score: 1)
Try
PROC SQL;
CREATE TABLE tbl_out AS
SELECT a.name AS name
,b.amount AS amount
,c.count AS count
FROM tbl_in AS a, tbl_in AS b, tbl_in AS c
;
QUIT;
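This performs a double self-join. Because the FROM clause lists the table three times with no join condition, the result is a Cartesian product, which should have the desired effect.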
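One caveat is that a plain self-join returns one row per combination of input rows, so any value that appears more than once in tbl_in would produce duplicate combinations. The following is a hedged variant, assuming you want each combination exactly once; the output name tbl_out_distinct is hypothetical.

```sas
/* Sketch: de-duplicate each column before forming the Cartesian product. */
proc sql;
  create table tbl_out_distinct as
  select a.name, b.amount, c.count
  from (select distinct name   from tbl_in) as a,
       (select distinct amount from tbl_in) as b,
       (select distinct count  from tbl_in) as c;
quit;
```

When a query like this runs, PROC SQL typically writes a note to the log saying that it involves Cartesian product joins that cannot be optimized; that is expected here.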
Answer 2: (score: 0)
This is a variation on @JustinJDavies's answer, using explicit CROSS JOIN clauses:
data have;
input NAME $ AMOUNT COUNT;
datalines;
RAJ 90 1
RAVI 20 4
JOHN 30 5
JOSEPH 40 3
;
run;
PROC SQL;
create table combs as
select *
from have(keep=NAME)
cross join have(keep=AMOUNT)
cross join have(keep=COUNT)
order by name, amount, count;
QUIT;
Result:
NAME AMOUNT COUNT
JOHN 20 1
JOHN 20 3
JOHN 20 4
JOHN 20 5
JOHN 30 1
JOHN 30 3
JOHN 30 4
JOHN 30 5
...
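As a quick sanity check on the result, the number of rows can be compared with the expected number of combinations. With four distinct values in each of the three columns of the sample data, the full set should have 4 x 4 x 4 = 64 rows. This is a small sketch; the alias n_rows is made up for illustration.

```sas
/* Sketch: count the rows in the combs table created above. */
proc sql;
  select count(*) as n_rows
  from combs;
quit;
```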