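I have a .csv file with embedded line breaks that I want to import into SAS. The problem is that the CUSTOMER column contains spaces and line breaks (wrapped text). Please help me work around this; a few other variables have the same issue. If I import the file manually, it works fine. See the example below, and look at SLN PJ0136 to see the problem.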
SLN MOD PM NE CUSTOMER
32121 GG 1 1 AVAILABLE UPON REQUEST
71403 EN 1 0 JET SUPPORT SERVICE INC.
305173 EN 1 1 UNKNOWN / COTTONWOOD, LLC / J SUPPORT SERVICE, INC.
PJ0136 PS 0 0 "UNKNOWN / GROUP B-50 INC AA
TC0004 anada CSC Europe
Inglewood Ava"
EB0162 RG 0 0 ATR
I am importing it with infile:
DATA WORK.test1;
%let _EFIERR_ = 0;
INFILE 'C:\Users\26631.IELPWC\Downloads\test.csv'
delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
INFORMAT
SLN $CHAR6. MOD $CHAR2. PM BEST1. NE BEST1. CUSTOMER $CHAR82. ;
FORMAT
SLN $CHAR6. MOD $CHAR2. PM BEST1. NE BEST1. CUSTOMER $CHAR82. ;
INPUT
SLN $ MOD $ PM NE CUSTOMER $ ;
if _ERROR_ then call symputx('_EFIERR_',1);
RUN;
Here is the incorrect output:
32121 GG 1 1 AVAILABLE UPON REQUEST
71403 EN 1 0 JET SUPPORT SERVICE INC.
305173 EN 1 1 UNKNOWN / COTTONWOOD, LLC / J SUPPORT SERVICE, INC.
PJ0136 PS 0 0 "UNKNOWN / GROUP B-50 INC AA
TC0004 . .
24719 . .
" . .
EB0162 RG 0 0 ATR
Answer 0: (score: 1)
Assuming your input data is in the following format:
SLN,MOD,PM,NE,CUSTOMER
32121,GG,1,1,AVAILABLE UPON REQUEST
71403,EN,1,0,JET SUPPORT SERVICE INC.
305173,EN,1,1,"UNKNOWN / COTTONWOOD, LLC / J SUPPORT SERVICE, INC."
PJ0136,PS,0,0,"UNKNOWN / GROUP B-50 INC AA
TC0004 anada CSC Europe
Inglewood Ava"
EB0162,RG,0,0,ATR
the following SAS code will produce the desired output:
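data TEST (drop=_TMP_:);
  /* Declare types and lengths; the _TMP_ variables are dropped from the output */
  format SLN $6. MOD $2. PM 8. NE 8. CUSTOMER $82. _TMP_STR $100.;
  infile 'input.csv' truncover firstobs=2 dlm=',' dsd lrecl=10000;
  input SLN MOD PM NE _TMP_STR @;
  _TMP_COUNT=0;
  /* Keep reading until the number of double quotes seen so far is even,
     i.e. the quoted CUSTOMER value has been closed */
  do until(mod(_TMP_COUNT, 2) = 0);
    /* Append the latest fragment, separated by a line feed ('0A'x) */
    CUSTOMER=catx('0A'x, CUSTOMER, _TMP_STR);
    _TMP_COUNT=_TMP_COUNT + countc(_TMP_STR, '"');
    if mod(_TMP_COUNT, 2) then do;
      input _TMP_STR;
    end;
  end;
  /* Strip the enclosing double quotes */
  CUSTOMER=dequote(CUSTOMER);
run;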
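Note that the CUSTOMER value for SLN='PJ0136' spans multiple lines (Unix-style line feeds). You can remove the line breaks by changing the separator passed to catx(...), which is '0A'x in the code above.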
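As a minimal sketch of that change, assuming the same input.csv layout as above and that a single space is an acceptable separator (the space and the data set name TEST_ONELINE are my own choices for illustration, not part of the original code), the fragments can be joined with a blank instead of a line feed:

```sas
/* Sketch only: same logic as above, but fragments are joined with a
   space so CUSTOMER ends up on a single line. TEST_ONELINE is a
   hypothetical data set name. */
data TEST_ONELINE (drop=_TMP_:);
  format SLN $6. MOD $2. PM 8. NE 8. CUSTOMER $82. _TMP_STR $100.;
  infile 'input.csv' truncover firstobs=2 dlm=',' dsd lrecl=10000;
  input SLN MOD PM NE _TMP_STR @;
  _TMP_COUNT=0;
  do until(mod(_TMP_COUNT, 2) = 0);
    /* Assumption: a blank is the desired separator between fragments */
    CUSTOMER=catx(' ', CUSTOMER, _TMP_STR);
    _TMP_COUNT=_TMP_COUNT + countc(_TMP_STR, '"');
    if mod(_TMP_COUNT, 2) then do;
      input _TMP_STR;
    end;
  end;
  CUSTOMER=dequote(CUSTOMER);
run;
```

With this variant the PJ0136 row should hold its three fragments in one line of text. One caveat that applies to both versions: the continuation lines are still read with dlm=',', so a continuation line that itself contains a comma would be cut at that comma.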