Splitting a text file into 3 datasets/tables

Time: 2016-05-16 21:40:03

Tags: sql sas

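I have data like the following in a text file.

How can I split the text file into 3 datasets/tables?

The 1st should contain the earnings data, the 2nd the redemptions data, and the 3rd the expirations data. Each section has many rows; I have shown only 3-4 rows for each here. I am trying to use the INFILE statement but don't know how to do the split. Here is an idea: first, read the initial data (earnings); whenever SAS recognizes the word redemptions, it must stop, and the remaining data must go to the second dataset; and whenever SAS recognizes the word Expirations, the data below that keyword must go to the 3rd dataset. Any suggestions?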

Earnings
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecekckk
....
redemptions
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecekckk
Expirations
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecd ccsdc
 djc c djc cjdcjjnc

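1 Answer:

Answer 0: (score: 1)

A retained variable can help you achieve this.

With the following code, just replace datalines in the infile statement with your file name, remove the datalines block, and set the infile options to suit your file.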

data rawImport;
  infile datalines dsd delimiter=' ' truncover;
  informat C1-C4 $32.;
  input C1-C4;
  datalines;
Earnings
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecekckk
Redemptions
abc 234 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecekckk
Expirations
abc 345 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecd ccsdc
djc c djc cjdcjjnc
;
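
For an external file, the same import step might look like the sketch below. The fileref name rawtxt and the path are placeholders I made up for illustration, not anything from the question; the infile options are copied from the step above.

```sas
/* Hypothetical fileref and path: point this at your own text file. */
filename rawtxt "/path/to/input.txt";

data rawImport;
  /* Same options as the DATALINES version, but reading from the fileref. */
  infile rawtxt dsd delimiter=' ' truncover;
  informat C1-C4 $32.;
  input C1-C4;
run;
```

One thing to check on real data: with dsd, every space counts as a separate delimiter, so a line that starts with a space or contains repeated spaces will produce blank values and shift the remaining fields. If your file is padded that way, dropping dsd (so that runs of spaces are treated as one delimiter) is probably the safer choice.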

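With a retained variable, we can now dispatch each row to the appropriate dataset. The variable remembers the most recent section header, and the header rows themselves are deleted. Note that the comparisons are case-sensitive, so the header lines in the file must match the spelling used in the code exactly (the sample in the question has a lowercase "redemptions").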

data Earnings Redemptions Expirations;
  set rawImport;
  length outputDS $ 12;
  retain outputDS;

  * Determine output dataset;
  if C1 = "Earnings" then do;
    outputDS = "Earnings";
    delete;
  end;
  else if C1 = "Redemptions" then do;
    outputDS = "Redemptions";
    delete;
  end;
  else if C1 = "Expirations" then do;
    outputDS = "Expirations";
    delete;
  end;

  * output to appropriate dataset;
  if outputDS = "Earnings" then output Earnings;
  else if outputDS = "Redemptions" then output Redemptions;
  else if outputDS = "Expirations" then output Expirations;

  drop outputDS;
run;
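
If the headers in the real file vary in case, as "redemptions" does in the question, one possible variant is to do the read and the split in a single DATA step and compare against upcase(C1). This is only a sketch under a few assumptions of mine: a header line contains nothing but the keyword, the variable name section is my own choice, and dsd is left out so that leading or repeated spaces are skipped.

```sas
data Earnings Redemptions Expirations;
  /* No DSD here: consecutive spaces count as one delimiter. */
  infile datalines truncover;
  length section $ 12;
  retain section;          /* remembers the current section across rows */
  informat C1-C4 $32.;
  input C1-C4;

  /* A header is a line whose only value is one of the three keywords. */
  if missing(C2) and upcase(C1) in ("EARNINGS", "REDEMPTIONS", "EXPIRATIONS") then do;
    section = upcase(C1);
    delete;                /* do not write the header row itself */
  end;

  select (section);
    when ("EARNINGS")    output Earnings;
    when ("REDEMPTIONS") output Redemptions;
    when ("EXPIRATIONS") output Expirations;
    otherwise;             /* rows before the first header are discarded */
  end;

  drop section;
  datalines;
Earnings
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
redemptions
abc 234 xyz abjjdd
Expirations
abc 345 xyz abjjdd
 djc c djc cjdcjjnc
;
```

To read from a file instead, replace datalines in the infile statement with your file name and remove the datalines block, as described for the first step.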

The log now shows:

NOTE: There were 13 observations read from the data set WORK.RAWIMPORT.
NOTE: The data set WORK.EARNINGS has 3 observations and 4 variables.
NOTE: The data set WORK.REDEMPTIONS has 3 observations and 4 variables.
NOTE: The data set WORK.EXPIRATIONS has 4 observations and 4 variables.