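I am working in SAS SQL.
I have a script that generates a varying number of tables depending on the time interval, one table per day. The column names are the same in every table except for the balance column, whose name contains the date. The tables are named TableName_date, e.g. TableName_07092017 ... TableName_31092017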
MT          column       RT  AREA  balance_07092017
ACCOUNTS    balance_lcy  30        2004862772
ACCOUNTS    balance_lcy  30  CA    121390255,8
ACCOUNTS    balance_lcy  30  GL    323499587
ACCOUNTS    balance_lcy  30  TF    -7821721555
C_ACCOUNTS  balance_lcy  35  CA    2,49733E+11
C_ACCOUNTS  balance_lcy  35  NO    3748192715
MT          column       RT  AREA  balance_08092017
ACCOUNTS    balance_lcy  30        -24278162321
ACCOUNTS    balance_lcy  30  CA    225363070.05
ACCOUNTS    balance_lcy  30  GL    3117815863.7
ACCOUNTS    balance_lcy  30  TF    47914289803
C_ACCOUNTS  balance_lcy  35  CA    37637391174
C_ACCOUNTS  balance_lcy  35  NO    163722935.2
Is it possible to create a script that joins these tables automatically? The result should look like this:
MT          column       RT  AREA  balance_07092017  balance_08092017  balance_09092017  balance_10092017....
ACCOUNTS    balance_lcy  30        2004862772        -24278162321
ACCOUNTS    balance_lcy  30  CA    121390255,8       225363070.05
ACCOUNTS    balance_lcy  30  GL    323499587         3117815863.7
ACCOUNTS    balance_lcy  30  TF    -7821721555       47914289803
C_ACCOUNTS  balance_lcy  35  CA    2,49733E+11       37637391174
C_ACCOUNTS  balance_lcy  35  NO    3748192715        163722935.2
This is the code that creates the tables I need to join:
%macro sqlloop(start,end);
  PROC SQL;
  /* Loop over SAS date values, creating one table per day */
  %DO DT_REP=&start. %TO &end.;
    /* Build the DDMMYYYY suffix used in the table name */
    %let year=%sysfunc(year(&DT_REP.));
    %let month=%sysfunc(month(&DT_REP.));
    %let month1=%sysfunc(PUTN(&month.,z2.));
    %let day=%sysfunc(day(&DT_REP.));
    %let day1=%sysfunc(PUTN(&day.,z2.));
    %let datum= &day1.&month1.&year.;
    %put &datum.;
    /* Totals per rec_type and area for both source tables on that day */
    CREATE TABLE DUPLICITY_BAL_&datum. as
    select 'ACCOUNTS' as MT, 'balance_lcy' as column, rec_type, area, sum(balance_lcy) as balance_lcy, count(balance_lcy) as count
    from database.ACCOUNTS
    where version_no = 1
    and dt_rep = &DT_REP.
    group by rec_type, area
    union all
    select 'C_ACCOUNTS' as MT, 'balance_lcy' as column, rec_type, area, sum(balance_lcy) as balance_lcy, count(balance_lcy) as count
    from database.C_ACCOUNTS
    where version_no = 1
    and dt_rep = &DT_REP.
    group by rec_type, area;
  %END;
  QUIT;
%mend;
%sqlloop(start=21070, end=21073)
Answer 0: (score: 1)
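Why use SQL at all? Merging multiple datasets is very simple with SAS data step code. If you want to merge all datasets whose names start with TableName_, use the : wildcard to avoid typing the individual dataset names.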
data want ;
merge TableName_: ;
by MT column RT AREA ;
run;
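One caveat worth spelling out: MERGE with a BY statement expects every input dataset to be sorted by the BY variables. Below is a minimal sketch for two of the daily tables, assuming they contain the key columns MT, column, RT and AREA as shown in the question. The table names come from the question's example, and want is just a placeholder output name.

```sas
/* Sort each daily table by the key columns before merging. */
proc sort data=TableName_07092017;
  by MT column RT AREA;
run;

proc sort data=TableName_08092017;
  by MT column RT AREA;
run;

/* One row per key combination, one balance_<date> column per input table. */
data want;
  merge TableName_07092017 TableName_08092017;
  by MT column RT AREA;
run;
```

If the tables are already sorted by these keys, the PROC SORT steps can be skipped.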
Why split the data into multiple tables in the first place? Why not generate all the dates in a single step?
%macro sqlloop(start,end);
PROC SQL;
CREATE TABLE DUPLICITY_BAL as
select 'ACCOUNTS' as MT
, 'balance_lcy' as column
, dt_rep
, rec_type
, area
, sum(balance_lcy) as balance_lcy
, count(balance_lcy) as count
from database.ACCOUNTS
where version_no = 1
and dt_rep between &start and &end
group by 1,2,3,4,5
union all
select 'C_ACCOUNTS' as MT
, 'balance_lcy' as column
, dt_rep
, rec_type
, area
, sum(balance_lcy) as balance_lcy
, count(balance_lcy) as count
from database.C_ACCOUNTS
where version_no = 1
and dt_rep between &start and &end
group by 1,2,3,4,5
;
quit;
%mend sqlloop;
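A possible call, as a sketch only: assuming dt_rep holds SAS date values (which the numeric arguments in the question suggest), the range can be passed as date literals rather than raw day counts, which is easier to read. 08SEP2017 and 11SEP2017 are the dates behind the values 21070 and 21073 used in the question.

```sas
/* Same date range as %sqlloop(start=21070, end=21073) in the question. */
%sqlloop("08SEP2017"d, "11SEP2017"d)
```

The result is a single long table, DUPLICITY_BAL, with one row per dt_rep, rec_type and area.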
Answer 1: (score: 0)
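First of all, don't do this; keep the data in long format. Keeping the data long means your table structure does not keep changing. Your report structure may change, but if the data stays long it is easier to handle from a database perspective. The wide format also violates the principles of "tidy data".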
Append the tables and rename balance_date to balance, or to some other name that is the same in all datasets.
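A rough sketch of that append, assuming the daily tables and balance columns are named as in the question. The names balance_long, source and dt_rep are placeholders chosen for illustration. The INDSNAME= option of the SET statement records which input table each row came from, so the date can be recovered from the table name.

```sas
data balance_long;
  /* Stack the daily tables, giving the balance column one common name. */
  set TableName_07092017(rename=(balance_07092017=balance))
      TableName_08092017(rename=(balance_08092017=balance))
      indsname=source;
  /* source looks like WORK.TABLENAME_07092017; the date is the part */
  /* after the last underscore, in DDMMYYYY form.                    */
  dt_rep = input(scan(source, -1, '_'), ddmmyy8.);
  format dt_rep date9.;
run;
```

The variable named in INDSNAME= is temporary and is not written to the output dataset, so only dt_rep carries the date forward.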
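If you want a wide report, use PROC REPORT. If you really, really want to store it in wide format (and you shouldn't), you can transpose it afterwards using PROC TRANSPOSE with ID and/or IDLABEL to create the same structure.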
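For the transpose route, something along these lines should work. It is a sketch, assuming a long table shaped like DUPLICITY_BAL from the other answer (columns MT, column, dt_rep, rec_type, area, balance_lcy, with dt_rep a SAS date). bal_sorted and bal_wide are placeholder names.

```sas
/* PROC TRANSPOSE with a BY statement needs the data sorted by the BY variables. */
proc sort data=DUPLICITY_BAL out=bal_sorted;
  by MT column rec_type area dt_rep;
run;

/* One output column per date; the column name is the prefix plus the */
/* formatted value of dt_rep, e.g. balance_08092017.                  */
proc transpose data=bal_sorted out=bal_wide(drop=_name_) prefix=balance_;
  by MT column rec_type area;
  id dt_rep;
  format dt_rep ddmmyyn8.;
  var balance_lcy;
run;
```

The stored data stays long; the wide table is rebuilt whenever the date range changes.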