How to convert a SAS data set to a data step

Time: 2018-11-02 20:22:00

Tags: sas

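How can I convert a SAS data set into a data step that I can easily paste into a forum post or hand to someone so they can replicate my data? Ideally, I would also like to control how many records are included.

For example, I have sashelp.class in the SASHELP library, but I want to provide it here so that others can use it as a starting point for my question.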

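2 answers:

Answer 0: (score: 5)

You can do this with a macro written by Mark Jordan at SAS; the code is also stored on GitHub.

You need to provide the data set name, including the library, and the number of observations to output. It takes them in order. The generated code will appear in your SAS log.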

*data set you want to create demo data for;
%let dataSetName = sashelp.Class;
*number of observations you want to keep;
%let obsKeep = 5;


******************************************************
DO NOT CHANGE ANYTHING BELOW THIS LINE
******************************************************;

%let source_path = https://gist.githubusercontent.com/statgeek/bcc55940dd825a13b9c8ca40a904cba9/raw/865d2cf18f5150b8e887218dde0fc3951d0ff15b/data2datastep.sas;

filename reprex url "&source_path";
%include reprex;
filename reprex;

option linesize=max;
%data2datastep(dsn=&dataSetName, obs=&obsKeep);

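If you cannot reach the GitHub page from SAS, this approach may not work. In that case, navigate to the page manually (same link) and copy and paste the macro source into SAS. Run that program first, then run only the last step, %data2datastep(dsn=, obs=);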
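A minimal sketch of that manual fallback, assuming you saved the macro source to a local file instead of pasting it into the editor. The path is hypothetical; the dsn= and obs= parameters are the ones used in the call above.

```sas
/* Hypothetical path: wherever you saved data2datastep.sas */
%include "C:\temp\data2datastep.sas";

option linesize=max;
%data2datastep(dsn=sashelp.class, obs=5);
```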

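Answer 1: (score: 3)

This topic came up recently on SAS Communities, and I created a macro that is more robust than the one Reeza linked to. You can see it on GitHub: ds2post.sas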

* Pull macro definition from GITHUB ;
filename ds2post url
  'https://raw.githubusercontent.com/sasutils/macros/master/ds2post.sas'
;
%include ds2post ;

For example, if you want to share the first 5 observations of SASHELP.CARS, you would run this macro call:

%ds2post(sashelp.cars,obs=5)

This generates the following code in the SAS log:

data work.cars (label='2004 Car Data');
  infile datalines dsd dlm='|' truncover;
  input Make :$13. Model :$40. Type :$8. Origin :$6. DriveTrain :$5.
    MSRP Invoice EngineSize Cylinders Horsepower MPG_City MPG_Highway
    Weight Wheelbase Length
  ;
  format MSRP dollar8. Invoice dollar8. ;
  label EngineSize='Engine Size (L)' MPG_City='MPG (City)'
    MPG_Highway='MPG (Highway)' Weight='Weight (LBS)'
    Wheelbase='Wheelbase (IN)' Length='Length (IN)'
  ;
datalines4;
Acura|MDX|SUV|Asia|All|36945|33337|3.5|6|265|17|23|4451|106|189
Acura|RSX Type S 2dr|Sedan|Asia|Front|23820|21761|2|4|200|24|31|2778|101|172
Acura|TSX 4dr|Sedan|Asia|Front|26990|24647|2.4|4|200|22|29|3230|105|183
Acura|TL 4dr|Sedan|Asia|Front|33195|30299|3.2|6|270|20|28|3575|108|186
Acura|3.5 RL 4dr|Sedan|Asia|Front|43755|39014|3.5|6|225|18|24|3880|115|197
;;;;

Try this little test to compare the two macros.

First, make a sample data set with a few problem values.

data testit;
  set sashelp.class (obs=5);
  if _n_=1 then name='Le Bron';
  if _n_=2 then age=.;
  if _n_=3 then wt=.;
  if _n_=4 then name='12;34';
run;

Then run both macros to dump the code to the SAS log.

%ds2post(testit);
%data2datastep(dsn=testit,obs=20);

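Copy the code from the log. Change the names in the DATA statements so that they overwrite neither the original data set nor each other. Run them and compare the results with the original.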

proc compare data=testit compare=testit1; run;
proc compare data=testit compare=testit2; run;
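If you only need a pass/fail answer rather than the full report, one option is to read the automatic macro variable SYSINFO right after each PROC COMPARE step. This is a sketch, not part of the original test; the macro variable names rc1 and rc2 are made up for illustration.

```sas
proc compare data=testit compare=testit1 noprint; run;
/* Capture SYSINFO immediately; later steps can reset it */
%let rc1 = &sysinfo;

proc compare data=testit compare=testit2 noprint; run;
%let rc2 = &sysinfo;

/* A return code of 0 means PROC COMPARE found no differences */
%put NOTE: TESTIT1 compare return code = &rc1;
%put NOTE: TESTIT2 compare return code = &rc2;
```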

Results using %DS2POST:

The COMPARE Procedure
Comparison of WORK.TESTIT with WORK.TESTIT1
(Method=EXACT)

Data Set Summary

Dataset                Created          Modified  NVar    NObs

WORK.TESTIT   02NOV18:17:09:40  02NOV18:17:09:40     6       5
WORK.TESTIT1  02NOV18:17:10:29  02NOV18:17:10:29     6       5

Variables Summary

Number of Variables in Common: 6.

Observation Summary

Observation      Base  Compare

First Obs           1        1
Last  Obs           5        5

Number of Observations in Common: 5.
Total Number of Observations Read from WORK.TESTIT: 5.
Total Number of Observations Read from WORK.TESTIT1: 5.

Number of Observations with Some Compared Variables Unequal: 0.
Number of Observations with All Compared Variables Equal: 5.

Summary of results using %Data2DataStep:

Comparison of WORK.TESTIT with WORK.TESTIT2
(Method=EXACT)

Data Set Summary

Dataset                Created          Modified  NVar    NObs

WORK.TESTIT   02NOV18:17:09:40  02NOV18:17:09:40     6       5
WORK.TESTIT2  02NOV18:17:10:29  02NOV18:17:10:29     6       3


Variables Summary

Number of Variables in Common: 6.


Observation Summary

Observation      Base  Compare

First Obs           1        1
First Unequal       1        1
Last  Unequal       3        3
Last  Match         3        3
Last  Obs           5        .

Number of Observations in Common: 3.
Number of Observations in WORK.TESTIT but not in WORK.TESTIT2: 2.
Total Number of Observations Read from WORK.TESTIT: 5.
Total Number of Observations Read from WORK.TESTIT2: 3.

Number of Observations with Some Compared Variables Unequal: 3.
Number of Observations with All Compared Variables Equal: 0.

Variable values summary:

Values Comparison Summary

Number of Variables Compared with All Observations Equal: 1.
Number of Variables Compared with Some Observations Unequal: 5.
Number of Variables with Missing Value Differences: 4.
Total Number of Values which Compare Unequal: 12.
Maximum Difference: 0.


Variables with Unequal Values

Variable  Type  Len  Ndif   MaxDif  MissDif

Name      CHAR    8     1                 0
Sex       CHAR    1     3                 3
Age       NUM     8     2        0        2
Height    NUM     8     3        0        3
Weight    NUM     8     3        0        3

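Note that I am sure there are values that will cause trouble for my macro as well. Hopefully they arise from data that is much less likely to occur than spaces or semicolons.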
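Problems with embedded spaces and semicolons can be reproduced without either macro. The sketch below is hand-written for illustration: the data set names space_delim and pipe_delim are made up, and this is not the code either macro generates. It contrasts blank-delimited list input with the DSD, pipe-delimited, DATALINES4 style shown in the %ds2post output above.

```sas
/* Blank-delimited list input: the embedded blank splits the name, */
/* so the first row is read as Name='Le' and Sex='B'.              */
data space_delim;
  infile datalines truncover;
  input Name :$8. Sex :$1.;
datalines;
Le Bron M
Alice F
;

/* DSD with an explicit delimiter keeps embedded blanks and reads an */
/* empty field as missing. DATALINES4 ends only at a line of four    */
/* semicolons, so a single semicolon inside the data is safe; with   */
/* plain DATALINES that semicolon would end the data block early.    */
data pipe_delim;
  infile datalines dsd dlm='|' truncover;
  input Name :$8. Sex :$1. Age;
datalines4;
Le Bron|M|14
Alice|F|
12;34|F|13
;;;;
```

Comparing the two data sets with PROC PRINT shows the difference directly. The early end of data caused by a semicolon would be consistent with the %data2datastep copy above having only 3 of the 5 observations, although the source does not confirm that cause.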