Problem exporting an R data frame to SAS format

Asked: 2017-09-27 20:06:47

Tags: r sas

I have a dataset that looks like this:

df_dummy = data.frame(
  Company=c("0001","0002","0003","0004","0005"),
  Measure=c("A","B","C","D","E"),
  Num=c(10,10,10,10,10),
  Den=c(20,20,20,20,20),
  Rate=c(50.0,50.0,50.0,50.0,50.0)
)

df_dummy$Company <- as.character(df_dummy$Company)
df_dummy$Measure <- as.character(df_dummy$Measure)

I export it to an .xpt file with:

library(SASxport)  # provides write.xport() and lookup.xport()
write.xport(df_dummy, file = "data/tmp.xpt")
lookup.xport("data/tmp.xpt")

In SAS, I import it with this code:

libname sasfile 'PATH\data';
libname xptfile xport 'PATH\data\tmp.xpt' access=readonly;
proc copy inlib=xptfile outlib=sasfile;
run;

The table looks fine, except that Rate is displayed without a decimal point.

My actual dataset has many more rows, but its layout is essentially the same. Running lookup.xport on it gives this:

Variables in data set `MEASURES':
  dataset    name      type format flength fdigits iformat iflength ifdigits label  nobs
 MEASURES   ID character              0       0                0        0       29064
 MEASURES MEASURE character              0       0                0        0       29064
 MEASURES     NUM   numeric              0       0                0        0       29064
 MEASURES     DEN   numeric              0       0                0        0       29064
 MEASURES    RATE   numeric              0       0                0        0       29064

However, when I import it with the same SAS code, the result looks completely off, and I can't figure out what is causing it.

2 answers:

Answer 0 (score: 1)

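I could not reproduce your problem on Mac OS X with R (3.4.1) and SAS (9.4 TS1M4), both 64-bit versions. Differences between 32-bit and 64-bit versions sometimes cause problems. I used RStudio and SAS UE, both of which are free for educational use.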

Full R code:

install.packages("SASxport")

library("SASxport")

df_dummy = data.frame(
  Company=c("0001","0002","0003","0004","0005"),
  Measure=c("A","B","C","D","E"),
  Num=c(10,10,10,10,10),
  Den=c(20,20,20,20,20),
  Rate=c(50.0,50.0,50.0,50.0,50.0)
)

df_dummy$Company <- as.character(df_dummy$Company)
df_dummy$Measure <- as.character(df_dummy$Measure)

write.xport(df_dummy, file = "tmp.xpt")

Full SAS code:

libname sasfile '/folders/myfolders/';
libname xptfile xport '/folders/myfolders/tmp.xpt' access=readonly;
proc copy inlib=xptfile outlib=sasfile;
run;

Answer 1 (score: 0)

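Your example works, even with an old version of R. Make sure your transport file is not being corrupted when it is transferred between machines. A transport file is binary data made of fixed-length 80-byte records, although most of its content looks like ASCII text.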

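SAS transport files follow the SAS V5 rules for names. Make sure your member names and variable names are valid SAS names of no more than 8 characters. Character variables cannot be longer than 200 characters.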
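
The naming rules above can be checked before exporting. Below is a minimal sketch in Python, used here only for illustration. It assumes the V5 rule is 1 to 8 characters, starting with a letter or underscore and continuing with letters, digits, or underscores. The helpers `is_valid_v5_name` and `check_columns` are hypothetical names, not part of SAS or SASxport.

```python
import re

# Assumed SAS V5 naming rule: 1-8 characters, first a letter or underscore,
# the rest letters, digits, or underscores.
V5_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}")
MAX_CHAR_LEN = 200  # character-variable limit quoted in the answer


def is_valid_v5_name(name):
    """Return True if `name` fits the assumed V5 naming rule."""
    return V5_NAME.fullmatch(name) is not None


def check_columns(columns):
    """Return a list of problems.

    `columns` maps a variable name to the longest string it holds,
    or to None for a numeric variable.
    """
    problems = []
    for name, width in columns.items():
        if not is_valid_v5_name(name):
            problems.append(f"{name}: not a valid V5 name")
        if width is not None and width > MAX_CHAR_LEN:
            problems.append(
                f"{name}: character length {width} exceeds {MAX_CHAR_LEN}"
            )
    return problems


if __name__ == "__main__":
    # Columns of the example data frame from the question.
    cols = {"Company": 4, "Measure": 1, "Num": None, "Den": None, "Rate": None}
    print(check_columns(cols))                # []
    print(check_columns({"CompanyName": 4}))  # ['CompanyName: not a valid V5 name']
```

The column names in the question's example all pass this check, so names are worth checking mainly for the larger real dataset.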

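You can take a quick look at the file with a simple data step, which is especially practical for your small example. If the file length is not a multiple of 80, or a header record does not start at the beginning of an 80-byte record, the file has been corrupted.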

 56         data _null_;
 57           infile '/test/tmp.xpt' lrecl=80 recfm=f ;
 58           input;
 59           list;
 60         run;

 NOTE: The infile '/test/tmp.xpt' is:
       Filename=/test/tmp.xpt,
       Owner Name=xxxxx,Group Name=xxxxx,
       Access Permission=-rw-r--r--,
       Last Modified=29Sep2017:09:16:16,
       File Size (bytes)=1680

 RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0                      
 1         HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000  

 2   CHAR  SAS     SAS     SASLIB  7.00    R 3.0.2.                        29SEP17:09:16:16
     ZONE  54522222545222225454442232332222523232302222222222222222222222223354533333333333
     NUMR  3130000031300000313C92007E000000203E0E200000000000000000000000002935017A09A16A16
 3         29SEP17:09:16:16                                                                
 4         HEADER RECORD*******MEMBER  HEADER RECORD!!!!!!!000000000000000001600000000140  
 5         HEADER RECORD*******DSCRPTR HEADER RECORD!!!!!!!000000000000000000000000000000  

 6   CHAR  SAS     DF_DUMMYSASDATA 7.00    R 3.0.2.                        29SEP17:09:16:16
     ZONE  54522222445454455454454232332222523232302222222222222222222222223354533333333333
     NUMR  3130000046F45DD9313414107E000000203E0E200000000000000000000000002935017A09A16A16
 7         29SEP17:09:16:16                                                                
 8         HEADER RECORD*******NAMESTR HEADER RECORD!!!!!!!000000000500000000000000000000  

 9   CHAR  ........COMPANY                                                 ........        
     ZONE  00000000444544522222222222222222222222222222222222222222222222220000000022222222
     NUMR  020008013FD01E900000000000000000000000000000000000000000000000000000000000000000

 10  CHAR  ....................................................................MEASURE     
     ZONE  00000000000000000000000000000000000000000000000000000000000000000000444555422222
     NUMR  00000000000000000000000000000000000000000000000000000000000002000802D51352500000

 11  CHAR                                              ........        ....................
     ZONE  22222222222222222222222222222222222222222222000000002222222200000000000000000000
     NUMR  00000000000000000000000000000000000000000000000000000000000000000008000000000000

 12  CHAR  ................................................NUM                             
     ZONE  00000000000000000000000000000000000000000000000045422222222222222222222222222222
     NUMR  000000000000000000000000000000000000000001000803E5D00000000000000000000000000000

 13  CHAR                          ........        ........................................
     ZONE  22222222222222222222222200000000222222220000000100000000000000000000000000000000
     NUMR  00000000000000000000000000000000000000000000000000000000000000000000000000000000

 14  CHAR  ............................DEN                                                 
     ZONE  00000000000000000000000000004442222222222222222222222222222222222222222222222222
     NUMR  000000000000000000000100080445E0000000000000000000000000000000000000000000000000
 RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0                      

 15  CHAR      ........        ............................................................
     ZONE  22220000000022222222000000010000000000000000000000000000000000000000000000000000
     NUMR  00000000000000000000000000080000000000000000000000000000000000000000000000000000

 16  CHAR  ........RATE                                                    ........        
     ZONE  00000000545422222222222222222222222222222222222222222222222222220000000022222222
     NUMR  01000805214500000000000000000000000000000000000000000000000000000000000000000000

 17  CHAR  ....... ....................................................                    
     ZONE  00000002000000000000000000000000000000000000000000000000000022222222222222222222
     NUMR  00000000000000000000000000000000000000000000000000000000000000000000000000000000
 18        HEADER RECORD*******OBS     HEADER RECORD!!!!!!!000000000000000000000000000000  

 19  CHAR  0001    A       A ......B.......B2......0002    B       A ......B.......B2......
     ZONE  33332222422222224A000000410000004300000033332222422222224A0000004100000043000000
     NUMR  00010000100000001000000024000000220000000002000020000000100000002400000022000000

 20  CHAR  0003    C       A ......B.......B2......0004    D       A ......B.......B2......
     ZONE  33332222422222224A000000410000004300000033332222422222224A0000004100000043000000
     NUMR  00030000300000001000000024000000220000000004000040000000100000002400000022000000

 21  CHAR  0005    E       A ......B.......B2......                                        
     ZONE  33332222422222224A00000041000000430000002222222222222222222222222222222222222222
     NUMR  00050000500000001000000024000000220000000000000000000000000000000000000000000000
 NOTE: 21 records were read from the infile '/test/tmp.xpt'.
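
The same checks can be scripted outside SAS. The following is a minimal Python sketch, not a full XPORT parser. It assumes the layout visible in the listing above: 80-byte records, the first being the library header, with numeric values stored as 8-byte IBM hexadecimal floating-point numbers. The function names `check_structure` and `ibm_to_float` are hypothetical, and missing values are not handled.

```python
RECORD_LEN = 80
HEADER_PREFIX = b"HEADER RECORD*******"
LIBRARY_HEADER = (
    b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"
    b"000000000000000000000000000000  "
)


def check_structure(data):
    """Return a list of problems found in the raw bytes of a transport file."""
    problems = []
    if len(data) % RECORD_LEN != 0:
        problems.append(f"size {len(data)} is not a multiple of {RECORD_LEN}")
    if data[:RECORD_LEN] != LIBRARY_HEADER:
        problems.append("first record is not the library header")
    # Every header record should start on an 80-byte record boundary.
    pos = data.find(HEADER_PREFIX)
    while pos != -1:
        if pos % RECORD_LEN != 0:
            problems.append(f"header record at misaligned offset {pos}")
        pos = data.find(HEADER_PREFIX, pos + 1)
    return problems


def ibm_to_float(b):
    """Decode one 8-byte IBM hexadecimal float (missing values not handled)."""
    if len(b) != 8:
        raise ValueError("expected 8 bytes")
    sign = -1.0 if b[0] & 0x80 else 1.0
    exponent = (b[0] & 0x7F) - 64  # power of 16, stored with a bias of 64
    fraction = int.from_bytes(b[1:], "big") / float(1 << 56)
    return sign * fraction * 16.0 ** exponent


if __name__ == "__main__":
    # To check a real file: check_structure(open("tmp.xpt", "rb").read())
    print(check_structure(LIBRARY_HEADER + b" " * 79))

    # First observation from the listing: two 8-byte character fields,
    # then three 8-byte numerics (Num, Den, Rate).
    obs = b"0001    A       " + bytes.fromhex(
        "41A0000000000000" "4214000000000000" "4232000000000000"
    )
    print([ibm_to_float(obs[i:i + 8]) for i in range(16, 40, 8)])
    # [10.0, 20.0, 50.0]
```

Decoded this way, Rate in the small example is stored as exactly 50. This suggests the missing decimal point there comes from how the value is displayed, not from data lost in the transport file.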