Blob field gets truncated in SAS

Time: 2014-07-23 19:24:44

Tags: sas data-integration

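I have been working on a SAS job that extracts a table from SQL Server and then loads it into an Oracle table.

One of the fields in SQL Server is a blob, and the values can be as large as 1G. When I run the job, the blob appears to be truncated in the Oracle table: I get length warnings, and the resulting files are corrupted.

I have seen SAS documentation say that a character variable can be at most 32K, but SAS also says it can access blobs of up to 2G.

How can we achieve this?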

proc sql;
create view work.W2K3NU8 as
  select
     ID,
     DNUMBER,
     FILENAME,
     FILE   
        format = $HEX2048.
        informat = $HEX2048.,
     (input(compress(DATEENTERED),YYMMDD10.)) as DATEENTERED length = 8
        format = date.
        informat = date.
        label = 'DATEENTERED',
     (input(compress(DATEADDED),YYMMDD10.)) as DATEADDED length = 8
        format = date.
        informat = date.
        label = 'DATEADDED',
     (input(compress(DATECHANGED),YYMMDD10.)) as DATECHANGED length = 8
        format = date.
        informat = date.
        label = 'DATECHANGED',
     TYPE
from &SYSLAST;
quit;

Here is the data step:

      data trd.GAFILES
          (dbnull = (
                     ID = NO
                     DNUMBER = YES
                     FILENAME = YES
                     GA_FILE = YES
                     DATEENTERED = YES
                     DATAADDED = YES
                     DATECHANGED = YES
                     TYPE = YES
                     ETL_CREATE = YES
                     ETL_UPDATE = YES));
     attrib ID length = $255
        format = $255.
        informat = $255.
        label = 'ID'; 
     attrib DNUMBER length = $10
        format = $10.
        informat = $10.
        label = 'DNUMBER'; 
     attrib FILENAME length = $255
        format = $255.
        informat = $255.
        label = 'FILENAME'; 
     attrib GA_FILE length = $4096
        format = $HEX2048.
        informat = $HEX2048.
        label = 'GA_FILE'; 
     attrib DATEENTERED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATEENTERED'; 
     attrib DATAADDED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATAADDED'; 
     attrib DATECHANGED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATECHANGED'; 
     attrib TYPE length = $100
        format = $100.
        informat = $100.
        label = 'TYPE'; 
     attrib ETL_CREATE length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'ETL_CREATE'; 
     attrib ETL_UPDATE length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'ETL_UPDATE'; 
     call missing(of _all_);
     stop;
  run;

1 Answer:

Answer 0: (score: 2)

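SAS data sets do not support character variables longer than 32,767 bytes. I'm not sure where you saw that larger sizes are supported; you may have been reading the ACCESS reference, which describes the data types of each DBMS (for example, the DB2 section describes BLOB and CLOB as allowing sizes up to 2 GB, but that describes what DB2 supports, not what SAS supports).

SAS will happily access a BLOB, but it will not bring in more than 32,767 bytes of it. You have to read it in chunks, or use DBMS-specific language in a pass-through session (which has to pass the value through without SAS touching it). You can read it in chunks like this (fill in the appropriate substring function and connection information):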

proc sql;
connect to <>;
create table SASTBL as 
  select * from connection to <> (
   select substring_Function(blobfield,1,32767) as blob_1,
          substring_Function(blobfield,32768,32767) as blob_2,
          substring_Function(blobfield,65535,32767) as blob_3,
(... etc ... )
  from your_tbl
);
quit;
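
Writing the chunk columns by hand gets error-prone quickly, so the offsets can be generated instead. The following is only a minimal sketch: the macro name `blob_chunks` and its parameters are hypothetical, and `substring_Function` and `blobfield` are the same placeholders as in the query above, to be replaced with whatever your DBMS actually provides.

```sas
/* Hypothetical helper: emits a comma-separated list of chunk expressions. */
/* Chunk k starts at (k - 1) * size + 1, so with size=32767 the starts     */
/* are 1, 32768, 65535, ...                                                 */
%macro blob_chunks(field=blobfield, n=3, size=32767, fn=substring_Function);
  %local k start;
  %do k = 1 %to &n;
    %let start = %eval((&k - 1) * &size + 1);
    %if &k > 1 %then %str(,);
    &fn(&field, &start, &size) as blob_&k
  %end;
%mend blob_chunks;

/* Write the generated column list to the log to check the offsets. */
%put %blob_chunks(n=3);
```

Inside the pass-through query, the hand-written columns would then become `select %blob_chunks(n=3) from your_tbl`. Keep in mind that a 1 GB value would need more than 32,000 such chunks, so this approach is realistic only for fairly small blobs; for the largest values, moving the data with DBMS-side tools is probably the safer route.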

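If you have 9.4, you could also try FedSQL for the conversion. I'm not very familiar with FedSQL, but it is intended to support more data types than SAS does. It does not explicitly say that it can support BLOBs (the notes on BLOB keep saying it is "mapped to a similar data type", which could mean char or varchar), but if you have 9.4 it may be worth a try.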