Replacing characters during INFILE with SCAN and TRANWRD

Time: 2016-12-03 16:23:09

Tags: sas

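I almost have this working now; I just need one final fix for some of the files I am reading in. The code I have posted is part of a macro that loops over similar files. The older raw files have forward slashes instead of dashes, and I have been trying to replace them at the read stage with no luck. SAS is returning missing values for these: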

data test;
        infile "&filename" 
        delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
        length
        EventTypes 8
        EventLabels $21
        EventID 8
        Player_ID 8
        ExpandedMinute 8
        Second 8
        TeamID 8
        EndY 8
        EndX 8
        Y 8
        X 8
        IsTouch $5
        ID 8
        Minute 8
        Period $10
        Type $25
        OutcomeType $12
        Area1 $25
        Area2 $25
        ParamVal1 $15
        ParamVal2 $15
        MatchID 8
        MatchDate 8
        HomeTeamName $100
        AwayTeamName $100
        FTScore $5
        HomeScore 8
        AwayScore 8 
        ;
        informat EventTypes best32. ;
        informat EventLabels $21. ;
        informat EventID best32. ;
        informat Player_ID best32. ;
        informat ExpandedMinute best32. ;
        informat Second best32. ;
        informat TeamID best32. ;
        informat EndY best32. ;
        informat EndX best32. ;
        informat Y best32. ;
        informat X best32. ;
        informat IsTouch $5. ;
        informat ID best32. ;
        informat Minute best32. ;
        informat Period $10. ;
        informat Type $25. ;
        informat OutcomeType $12. ;
        informat Area1 $25. ;
        informat Area2 $25. ;
        informat ParamVal1 $15. ;
        informat ParamVal2 $15. ;
        informat MatchID best32. ;
        informat MatchDate ddmmyy10. ;
        informat HomeTeamName $100. ;
        informat AwayTeamName $100. ;
        informat FTScore $5. ;
        informat HomeScore best32. ;
        informat AwayScore best32. ;
        format EventTypes best12. ;
        format EventLabels $21. ;
        format EventID best12. ;
        format Player_ID best12. ;
        format ExpandedMinute best12. ;
        format Second best12. ;
        format TeamID best12. ;
        format EndY best12. ;
        format EndX best12. ;
        format Y best12. ;
        format X best12. ;
        format IsTouch $5. ;
        format ID best12. ;
        format Minute best12. ;
        format Period $10. ;
        format Type $25. ;
        format OutcomeType $12. ;
        format Area1 $25. ;
        format Area2 $25. ;
        format ParamVal1 $15. ;
        format ParamVal2 $15. ;
        format MatchID best12. ;
        format MatchDate ddmmyy10. ;
        format HomeTeamName $100. ;
        format AwayTeamName $100. ;
        format FTScore $5. ;
        format HomeScore best12. ;
        format AwayScore best12. ;
        input
        EventTypes
        EventLabels $
        EventID
        Player_ID
        ExpandedMinute
        Second
        TeamID
        EndY
        EndX
        Y
        X
        IsTouch $
        ID
        Minute
        Period $
        Type $
        OutcomeType $
        Area1 $
        Area2 $ @;
        if scan(_infile_,20,',') not in ('Back', 'Defence', 'Forward', 'Left', 'Midfield', 'Right') then 
        input ParamVal1 @;
        else 
        input ParamVal2 @;
        input
        MatchID @;
        MatchDate = tranwrd((scan(_infile_,21,',')), "/", "-");
        input MatchDate @;
        input
        HomeTeamName $
        AwayTeamName $
        FTScore $
        HomeScore
        AwayScore;
        if ParamVal1 = '' then ParamVal1 = '0';
        if ParamVal2 = '' then ParamVal2 = 'None';
        run;

Can anyone suggest a modification to the above to make it work?

Thanks

Edit

As requested, here is a raw data line that generates the missing value:

118,shortPassAccurate,3,4511,0,5,24,48.5,51.1,52.2,49.4,True,1394118243.0,0,FirstHalf,Start,Successful,PassEndX,None,51.1,410988,08/14/2010,Aston Villa,West Ham,3 : 0,3,0,

The job log looks like this:

Record: _410988_08_14_2010 processed successfully. Processing next record...
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2         0,None,2,0,0,0,24,0.0,0.0,0.0,0.0,False,1505579494.0,0,NoQual,NoQual,NoQual,NoQual,NoQual,0.0,410988
     101  ,08/14/2010,Aston Villa,West Ham,3 : 0,3,0, 143
EventTypes=0 EventLabels=None EventID=2 Player_ID=0 ExpandedMinute=0 Second=0 TeamID=24 EndY=0 EndX=0 Y=0 X=0 IsTouch=False
ID=1505579494 Minute=0 Period=NoQual Type=NoQual OutcomeType=NoQual Area1=NoQual Area2=NoQual ParamVal1=0.0 ParamVal2=None
MatchID=410988 MatchDate=. HomeTeamName=Aston Villa AwayTeamName=West Ham FTScore=3 : 0 HomeScore=3 AwayScore=0 _ERROR_=1
_INFILE_=0,None,2,0,0,0,24,0.0,0.0,0.0,0.0,False,1505579494.0,0,NoQual,NoQual,NoQual,NoQual,NoQual,0.0,410988,08/14/2010,Aston Villa,
West Ham,3 : 0,3,0, _N_=1
3         0,None,2,0,0,0,29,0.0,0.0,0.0,0.0,False,49800133.0,0,NoQual,NoQual,NoQual,NoQual,NoQual,0.0,410988,0
     101  8/14/2010,Aston Villa,West Ham,3 : 0,3,0, 141
EventTypes=0 EventLabels=None EventID=2 Player_ID=0 ExpandedMinute=0 Second=0 TeamID=29 EndY=0 EndX=0 Y=0 X=0 IsTouch=False
ID=49800133 Minute=0 Period=NoQual Type=NoQual OutcomeType=NoQual Area1=NoQual Area2=NoQual ParamVal1=0.0 ParamVal2=None
MatchID=410988 MatchDate=. HomeTeamName=Aston Villa AwayTeamName=West Ham FTScore=3 : 0 HomeScore=3 AwayScore=0 _ERROR_=1
_INFILE_=0,None,2,0,0,0,29,0.0,0.0,0.0,0.0,False,49800133.0,0,NoQual,NoQual,NoQual,NoQual,NoQual,0.0,410988,08/14/2010,Aston Villa,We
st Ham,3 : 0,3,0, _N_=2
4         90,midThird,3,4511,0,5,24,48.5,51.1,52.2,49.4,True,1394118243.0,0,FirstHalf,Start,Successful,PassEnd
     101  Y,None,48.5,410988,08/14/2010,Aston Villa,West Ham,3 : 0,3,0, 161
EventTypes=90 EventLabels=midThird EventID=3 Player_ID=4511 ExpandedMinute=0 Second=5 TeamID=24 EndY=48.5 EndX=51.1 Y=52.2 X=49.4
IsTouch=True ID=1394118243 Minute=0 Period=FirstHalf Type=Start OutcomeType=Successful Area1=PassEndY Area2=None ParamVal1=48.5
ParamVal2=None MatchID=410988 MatchDate=. HomeTeamName=Aston Villa AwayTeamName=West Ham FTScore=3 : 0 HomeScore=3 AwayScore=0
_ERROR_=1
_INFILE_=90,midThird,3,4511,0,5,24,48.5,51.1,52.2,49.4,True,1394118243.0,0,FirstHalf,Start,Successful,PassEndY,None,48.5,410988,08/14
/2010,Aston Villa,West Ham,3 : 0,3,0, _N_=3
5         90,midThird,3,4511,0,5,24,48.5,51.1,52.2,49.4,True,1394118243.0,0,FirstHalf,Start,Successful,Length,
     101  None,3.1,410988,08/14/2010,Aston Villa,West Ham,3 : 0,3,0, 158
EventTypes=90 EventLabels=midThird EventID=3 Player_ID=4511 ExpandedMinute=0 Second=5 TeamID=24 EndY=48.5 EndX=51.1 Y=52.2 X=49.4
IsTouch=True ID=1394118243 Minute=0 Period=FirstHalf Type=Start OutcomeType=Successful Area1=Length Area2=None ParamVal1=3.1
ParamVal2=None MatchID=410988 MatchDate=. HomeTeamName=Aston Villa AwayTeamName=West Ham FTScore=3 : 0 HomeScore=3 AwayScore=0
_ERROR_=1
_INFILE_=90,midThird,3,4511,0,5,24,48.5,51.1,52.2,49.4,True,1394118243.0,0,FirstHalf,Start,Successful,Length,None,3.1,410988,08/14/20
10,Aston Villa,West Ham,3 : 0,3,0, _N_=4
6         90,midThird,3,4511,0,5,24,48.5,51.1,52.2,49.4,True,1394118243.0,0,FirstHalf,Start,Successful,Angle,N
     101  one,5.3,410988,08/14/2010,Aston Villa,West Ham,3 : 0,3,0, 157
EventTypes=90 EventLabels=midThird EventID=3 Player_ID=4511 ExpandedMinute=0 Second=5 TeamID=24 EndY=48.5 EndX=51.1 Y=52.2 X=49.4
IsTouch=True ID=1394118243 Minute=0 Period=FirstHalf Type=Start OutcomeType=Successful Area1=Angle Area2=None ParamVal1=5.3
ParamVal2=None MatchID=410988 MatchDate=. HomeTeamName=Aston Villa AwayTeamName=West Ham FTScore=3 : 0 HomeScore=3 AwayScore=0
_ERROR_=1
_INFILE_=90,midThird,3,4511,0,5,24,48.5,51.1,52.2,49.4,True,1394118243.0,0,FirstHalf,Start,Successful,Angle,None,5.3,410988,08/14/201
0,Aston Villa,West Ham,3 : 0,3,0, _N_=5

1 answer:

Answer 0 (score: 1)

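The value is missing because there is no 14th month. Is it possible that the values are in MMDDYY format rather than DDMMYY format? And what about the other lines of the source file?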
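To illustrate the point, here is a minimal sketch that converts the date string seen in the log with both informats. The variable names `raw`, `as_dmy` and `as_mdy` are made up for this example. Note that both DDMMYY10. and MMDDYY10. accept slashes as well as dashes as separators, so the slashes themselves should not be the problem; also, assigning the TRANWRD result to MatchDate does not change the input buffer that the following INPUT statement reads from.

```sas
data _null_;
    /* Date string copied from the log in the question */
    raw = '08/14/2010';

    /* ?? suppresses the invalid-data note and keeps _ERROR_ at 0 */
    as_dmy = input(raw, ?? ddmmyy10.);  /* day 08, month 14: not a valid date */
    as_mdy = input(raw, ?? mmddyy10.);  /* month 08, day 14: valid */

    put as_dmy= date9. as_mdy= date9.;
run;
```

If the reasoning above holds, the log should show a missing value for `as_dmy` and 14AUG2010 for `as_mdy`.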

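Perhaps you should read the field in as a character string and then analyze it to see whether the values are consistent with a single date format. If the values were not entered in a consistent format, you will need some additional information to help decide whether an ambiguous value such as 05/05/2015 is in MDY or DMY order.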
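One way to do that analysis is sketched below. It is only an illustration under stated assumptions: the dataset name `dates_checked` and the variables `MatchDateRaw`, `part1`, `part2` and `layout` are hypothetical, and the three sample values stand in for the real date column. A first part greater than 12 can only be a day, and likewise for the second part; anything else cannot be classified from the value alone.

```sas
data dates_checked;
    infile datalines truncover;
    length MatchDateRaw $10 layout $9;
    input MatchDateRaw :$10.;

    /* Split on either separator and convert the first two parts to numbers */
    part1 = input(scan(MatchDateRaw, 1, '/-'), ?? 2.);
    part2 = input(scan(MatchDateRaw, 2, '/-'), ?? 2.);

    /* A part above 12 cannot be a month, which fixes the order */
    if part1 > 12 and part2 <= 12 then layout = 'DMY';
    else if part2 > 12 and part1 <= 12 then layout = 'MDY';
    else layout = 'ambiguous';

    /* Convert only when the order is certain; otherwise leave missing */
    if layout = 'MDY' then MatchDate = input(MatchDateRaw, ?? mmddyy10.);
    else if layout = 'DMY' then MatchDate = input(MatchDateRaw, ?? ddmmyy10.);
    format MatchDate yymmdd10.;

    datalines;
08/14/2010
14-08-2010
05/05/2015
;
run;

/* Count the layouts to see whether a file uses one order throughout */
proc freq data=dates_checked;
    tables layout;
run;
```

If every classifiable row in a file turns out to have the same layout, the ambiguous rows in that file can reasonably be read with the same informat. Since DATALINES cannot be used inside a macro, the real version would read from the `infile "&filename"` statement instead.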

Personally, I tell data vendors to output dates in YMD format to avoid this kind of confusion.