SAS - Adding the source table name as a column in a report

Date: 2017-05-10 13:19:18

Tags: sas report

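I have an output table containing more than 300 variables from 30 different tables, which are joined with UNION and used for modelling. I wrote a macro that uses this output table to build a report with a range of statistics such as mean, min/max and so on. I am trying to add a column to the report that states which table or tables each variable comes from. I say "tables" because some variables are shared between different tables. I want to avoid listing the same variable more than once in the report, because its statistics are identical regardless of which table it came from. Is there an efficient way to do this?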

3 answers:

Answer 0: (score: 0)

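If it were me, I would loop over each dataset in the union and put the table name and variable names into one compiled dataset. You presumably already have all the table names in a macro list or typed out, so it takes only a few extra lines to run proc contents on each table and compile a complete list of table and variable names. Note that, as in your example, you can deal with the duplicated variable names after the list has been compiled: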

** create different tables **;
data height; set sashelp.class(keep=name height); run;
data weight; set sashelp.class(keep=name weight); run;
data sex; set sashelp.class(keep=name sex); run;

** put your datasets into a list either manually or dynamically **;
/* manually */
%let ds_list=height weight sex; 

/* dynamically -- be careful to include only tables in your union */
proc sql noprint;
    select MEMNAME
    into: ds_list separated by " "
    from sashelp.vmember
    where libname = "WORK" and memtype = "DATA"; /* datasets only; skips catalogs such as SASMACR and FORMATS */
quit;

%put &ds_list.;

** loop over each table to put the table name and variables in a dataset **;
%MACRO get_names(ds_list);
%local i ds; /* keep the loop variables local to this macro */
%do i=1 %to %sysfunc(countw(&ds_list.));
    %let ds = %scan(&ds_list.,&i.);
    proc contents data = &ds. noprint 
        out=names_&ds.(keep=MEMNAME NAME rename=(MEMNAME=SOURCE_DATASET));
    run;

    proc append data = names_&ds. base=full force; run;
%end;
%MEND;

%get_names(&ds_list.);
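
The handling of duplicated variable names mentioned above could look something like the following. This is only a minimal sketch, assuming the `full` dataset built by the macro has the columns `NAME` and `SOURCE_DATASET`; the names `full_sorted`, `var_sources` and `source_tables`, and the length of 1000, are made up for illustration.

```sas
/* Sketch: collapse FULL (one row per table/variable pair) into one row per variable. */
proc sort data=full out=full_sorted nodupkey;
    by name source_dataset;
run;

data var_sources (keep=name source_tables);
    length source_tables $1000;   /* assumed maximum length of the table list */
    retain source_tables;
    set full_sorted;
    by name;
    /* start a new list for each variable, otherwise append with a separator */
    if first.name then source_tables = source_dataset;
    else source_tables = catx(', ', source_tables, source_dataset);
    /* write one row per variable once all its tables have been collected */
    if last.name then output;
run;
```

BY-group comparisons are case-sensitive, so if the same variable is spelled with different case in different tables, it may be worth applying `upcase` to `name` before sorting.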

Answer 1: (score: 0)

Instead of UNION, consider using a DATA step with the INDSNAME option on the SET statement.

data want;
set sashelp.class sashelp.cars indsname=source;
source_dataset = source;
run;
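
The INDSNAME variable holds the two-level name of the dataset the current row was read from (for example `SASHELP.CLASS`) and is not itself written to the output, which is why it is copied into `source_dataset`. If only the table name is wanted, a variation along these lines might work; the `$32` length is an assumption based on the usual limit for SAS member names.

```sas
data want;
    length source_dataset $32;                /* assumed: member names are at most 32 characters */
    set sashelp.class sashelp.cars indsname=source;
    source_dataset = scan(source, 2, '.');    /* keep the part after the libref */
run;
```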

Answer 2: (score: 0)

I managed to do this with the following approach:

Create a table that lists each variable with its source table.

PROC SQL;
CREATE TABLE    SOURCES AS
SELECT          NAME
                ,MEMNAME
FROM            DICTIONARY.COLUMNS
WHERE           LIBNAME='LIBNAME'
ORDER BY 1,2;
QUIT;

Join it to my statistics table.

PROC SQL;
CREATE TABLE    STATS_NEW AS
SELECT          memname AS TABLE_NAME,a.*
FROM            STATS a
LEFT JOIN       SOURCES b
ON              a.name = b.name
ORDER BY        a.name, b.memname;
QUIT;

Collapse the rows to one per variable, building a comma-separated list of tables.

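DATA STATS_TRANSPOSE (DROP=TABLE_NAME);
    LENGTH INPUT_TABLES $1000;
    SET STATS_NEW;
    BY name;
    RETAIN INPUT_TABLES;

    /* Start a new list on the first row of each variable, otherwise append. */
    /* CATX inserts ', ' between items; CATS would strip the blank after the comma. */
    IF FIRST.name THEN INPUT_TABLES = TABLE_NAME;
    ELSE INPUT_TABLES = CATX(', ', INPUT_TABLES, TABLE_NAME);

    /* Write one row per variable once all of its tables are collected. */
    IF LAST.name THEN DO;
        /* FIELD1 and FIELD2 are placeholders for variables reported as 'ALL'. */
        IF name IN ('FIELD1','FIELD2') THEN INPUT_TABLES = 'ALL';
        OUTPUT;
    END;
RUN;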