Question

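I have an odd dataset that I need to import into SAS, splitting the records into two tables depending on their format and dropping some records entirely. The data is structured like this: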

c Comment line 1
c Comment line 2
t lines init
a 'mme006'   M   8   99   15   '111 ME - RANDOLPH ST'
  path=no
    dwt=0.01  42427  ttf=1  us1=3  us2=0
    dwt=#0   42350  ttf=1  us1=1.8  us2=0  lay=3
    dwt=>0  42352  ttf=1  us1=0.5  us2=18.13
    42349  lay=3
a 'mme007'   M   8   99   15   '111 ME - RANDOLPH ST'
  path=no
    dwt=+0  42367  ttf=1  us1=0.6  us2=0
    dwt=0.01  42368  ttf=1  us1=0.6  us2=35.63 lay=3
    dwt=#0  42369  ttf=1  us1=0.3  us2=0
    42381  lay=3

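Only the lines that begin with a, dwt, or an integer need to be kept.

For the lines beginning with a, the desired output is a table like this, called "lines", which contains the first two non-a values on the line: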

 name   | type
--------+------
 mme006 | M
 mme007 | M

For the dwt/integer lines, the table "itins" would look like this:

 anode | dwt  | ttf | us1 | us2   | lay
 ------+------+-----+-----+-------+-----
 42427 | 0.01 |   1 | 3.0 |  0.00 |
 42350 | #0   |   1 | 1.8 |  0.00 |   3
 42352 | >0   |   1 | 0.5 | 18.13 | 
 42349 |      |     |     |       |   3       <-- line starting with integer
 42367 | +0   |   1 | 0.6 |  0.00 |
 42368 | 0.01 |   1 | 0.6 | 35.63 |   3
 42369 | #0   |   1 | 0.3 |  0.00 |
 42381 |      |     |     |       |   3       <-- line starting with integer

The code I have so far is almost there, but not quite:

data lines itins;
  infile in1 missover;
  input @1 first $1. @;
      if first in ('c','t') then delete;
      else if first='a' then do;
        input name $ type $;
        output lines; end;
      else do;
        input @1 path=$ dwt=$ anode ttf= us1= us2= us3= lay=;
        if path='no' then delete;
        output itins; end;

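Problems:

The "lines" table is correct, except that I can't get rid of the quotes around the "name" values (e.g. 'mme006').
In the "itins" table, "ttf", "us1", and "us2" are populated correctly. However, "anode" and "lay" are always missing, and "dwt" holds values such as #0 4236 and 0.01 42, always 8 characters long, borrowing part of what should belong to "anode".

What am I doing wrong?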

Answer 1

DEQUOTE() will remove matched quotation marks.
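
As a minimal sketch of that point, the step below applies DEQUOTE to a hard-coded value shaped like the ones in the question. The variable name raw is made up for illustration; in the real program the call would go right after name is read.

```sas
/* Sketch only: raw is a hypothetical stand-in for the value read from the file */
data _null_;
  length raw name $8;
  raw  = "'mme006'";     /* the value as list input reads it, quotes included */
  name = dequote(raw);   /* strip the matching pair of single quotes */
  put raw= name=;        /* write both values to the log for comparison */
run;
```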

The problem with dwt is that you need to tell SAS which informat to use; so if dwt is four characters long, use :$4. instead of $.
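
To illustrate what the colon modifier does, here is a simplified sketch that uses plain list input on inline data. It assumes the dwt= prefix has already been removed, so it is not the named-input case from the question. The data set name demo is made up.

```sas
/* Sketch only: simplified list input, not the named input used in the question */
data demo;
  infile datalines;
  /* :$4. reads up to the next blank and stores at most 4 characters,
     so the anode value that follows is left for the next variable */
  input dwt :$4. anode;
  datalines;
0.01 42427
#0 42350
>0 42352
;
run;
```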

However, anode is a problem. The solution I came up with is:

data lines itins;
  infile in1 missover;
  input @1 first $1. @;
      if first in ('c','t') then delete;
      else if first='a' then do;
        input name $ type $;
        name = dequote(name);  /* strip the matched quotes, as noted above */
        output lines; end;
      else do;
        input @1 path= $ @;
        if path='no' then delete;
        else do;
            if substr(_infile_,5,1)='d' then do;
                input dwt= :$12. ttf= us1= us2= us3= lay=;
                anode=input(scan(dwt,2,' '),best.);
                dwt=scan(dwt,1,' ');
                output itins; 
            end;
            else do;
                input @5 anode 5. lay=;
                output itins;
            end;
        end;
    end;

run;

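Basically, check for path first; then, if it is not a path line, check for the 'd' of dwt. If it is there, read the line in as a dwt line, which pulls anode into dwt, and then split the two apart. If it is not there, just read in anode and lay.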

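If the width of dwt is not always 2-4 characters, so that the informat might need to be shorter, this may not work, and you will have to work out the position of anode explicitly in order to read it correctly.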
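
One way to avoid depending on the width of dwt is to split each record into tokens and sort them by whether they contain an equals sign. The following is only a sketch of that alternative, not the approach used above. It assumes tokens are separated by spaces (not tabs), that key=value pairs have no spaces around the equals sign, and that a bare token on a kept line is always anode. The data set name itins_alt and the helper variables tok, key, val, and i are made up for illustration.

```sas
/* Sketch only: token-based parsing under the assumptions stated above */
data itins_alt(keep=anode dwt ttf us1 us2 lay);
  infile in1 truncover;
  length dwt $8 tok $40 key $8 val $32;
  input;                                   /* load the record into _infile_ */
  tok = scan(_infile_, 1, ' ');
  if missing(tok) then delete;             /* skip blank lines */
  /* keep only dwt= lines and lines whose first token is all digits */
  if lowcase(substr(tok, 1, 4)) ne 'dwt=' and notdigit(strip(tok)) > 0 then delete;
  do i = 1 to countw(_infile_, ' ');
    tok = scan(_infile_, i, ' ');
    if index(tok, '=') = 0 then anode = input(tok, ?? best12.);  /* bare token */
    else do;
      key = lowcase(scan(tok, 1, '='));    /* text before the equals sign */
      val = scan(tok, 2, '=');             /* text after the equals sign */
      select (key);
        when ('dwt') dwt = val;
        when ('ttf') ttf = input(val, ?? best12.);
        when ('us1') us1 = input(val, ?? best12.);
        when ('us2') us2 = input(val, ?? best12.);
        when ('lay') lay = input(val, ?? best12.);
        otherwise;                         /* ignore any other key */
      end;
    end;
  end;
run;
```

Because each value is taken from its own token, dwt can be any width up to the declared length of 8.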

How can I read this kind of multi-format data in SAS?