How do I remove the Missing Frequency from my results in SAS?

Time: 2017-07-19 00:25:06

Tags: sas

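I am new to this and have already posted this question, but I don't think I explained it well.

I have a dataset in SAS. Some cells are empty [nothing in them], and in the SAS output window they show a dot in the cell. When I run the results, SAS adds MISSING FREQUENCY = 7 (or whatever the number is) at the end of the table...

How can I make SAS ignore the missing frequency and use only the cells that have a value? Please see my screenshot, code and my CSV: OUTPUT DATA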

RESULT WITH the MISSING frequency at the bottom

/* Generated Code (IMPORT) */
/* Source File:2012_16_ChathamPed.csv */
/* Source Path: /home/cwacta0/my_courses/Week2/ACCIDENTS */
PROC IMPORT 
        DATAFILE='/home/cwacta0/my_courses/Week2/ACCIDENTS/2012_16_ChathamPed.csv' 
        OUT=imported REPLACE;
    GETNAMES=YES;
    GUESSINGROWS=32767;
RUN;

proc contents data=work.imported;
run;

libname mydata"/courses/d1406ae5ba27fe300" access=readonly;
run;

/* sorting data by location*/
PROC SORT ;
    by LocationOfimpact;
    LABEL Route="STREET NAME" Fatalities="FATALITIES" Injuries="INJURIES" 
        SeriousInjuries="SERIOUS INJURIES" LocationOfimpact="LOCATION OF IMPACT" 
        MannerOfCollision="MANNER OF COLLISION" 
        U1Factors="PRIMARY CAUSES OF ACCIDENT" 
        U1TrafficControl="TRAFFIC CONTROL SIGNS AT THE LOCATION" 
        U2Factors="SECONDARY CAUSES OF ACCIDENT" 
        U2TrafficControl="OTHER TRAFFIC CONTROL SIGNS AT THE LOCATION" 
        Light="TYPE OF LIGHTHING AT THE TIME OF THE ACCIDENT" 
        DriverAge1="AGE OF THE DRIVER" DriverAge2="AGE OF THE CYCLIST";

    /* Here I was unable to extract the drivers aged 25 or less and the drivers who disregarded a stop sign. Here is how I coded it;
    IF DriverAge1 LE 25;
    IF U1Factors="Failed to Yield" OR U1Factors= "Disregard Stop Sign";
    Run;

    Also, I want to remove the Missing DATA under the results. But in the data, those are just a blank cell. How do I tell SAS to disregard a blank cell and not add it to the result?
    Here is what I did and it does not work...

    if U1Factors="BLANK" Then U1Factors=".";
    Please help me figure this out... Thanks

    IF U1Factors="." Then call missing(U1Factors)*/;

Data want;
    set imported;

    IF DriverAge1 LE 25 And U1Factors in ("Failed to Yield", "Wrong Side of Road", 
        "Inattentive");

    IF Light in ("DarkLighted", "DarkNot Lighted", "Dawn");
run;

proc freq ;
    tables /*Route Fatalities Injuries SeriousInjuries LocationOfimpact MannerOfCollision*/
    U1Factors /*U1TrafficControl U2Factors U2TrafficControl*/
    light DriverAge1 DriverAge2;
RUN;

1 Answer:

Answer 0 (score: 0)

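SAS displays a missing numeric value as a period. So if the DriverAge1 column in the CSV file has nothing in it for a row, that observation will have a missing value. If your variable is character, SAS will normally also convert a value that consists of just a single period in the input stream into blanks in the SAS variable.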
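As a small illustration of the two cases described above, the sketch below reads two comma-separated rows from inline data. The dataset name `demo` and its variables are hypothetical and are not taken from the asker's CSV; the example assumes list input with the standard `$w.` character informat.

```sas
/* Hypothetical toy data, not the asker's CSV */
data demo;
    infile datalines dsd truncover;   /* DSD: comma-delimited, empty field = missing */
    input id age factor :$20.;
    age_is_missing    = missing(age);     /* 1 when age is missing */
    factor_is_missing = missing(factor);  /* 1 when factor is blank */
    datalines;
1,19,Inattentive
2,,.
;
run;

proc print data=demo;
run;
```

In the second row the empty numeric field is stored as a missing value and printed as a period, while the lone period in the character field is stored as blanks. This is why comparing a character variable with "." or "BLANK" does not find the empty cells; `missing(factor)` does.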

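A missing numeric value is considered to be less than any actual number. So if you use a less-than-or-equal condition, missing values will be included unless you exclude them with some other condition.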
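A minimal sketch of that comparison rule, using a hypothetical three-row dataset named `ages` (not the asker's data):

```sas
/* Hypothetical toy data: three drivers, one with a missing age */
data ages;
    input id age;
    datalines;
1 19
2 .
3 40
;
run;

/* age <= 25 alone also keeps id 2, because missing sorts below every number */
proc print data=ages;
    where age <= 25;
run;

/* Adding the lower bound . < age drops id 2 and keeps only id 1 */
proc print data=ages;
    where . < age <= 25;
run;
```

If the data might contain special missing values (.A to .Z), which sort above the ordinary missing value, `not missing(age)` is the safer test.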

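You can use a WHERE statement in procedures to filter the data. If you want to add conditions in separate statements, use the WHERE ALSO syntax to append each extra condition.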

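By default, PROC FREQ leaves missing values out of the table and reports their count underneath as "Frequency Missing". If you want the missing category to appear in the PROC FREQ output, add the MISSPRINT option to the TABLES statement. Or add the MISSING option, in which case it will appear and also be counted in the statistics.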

proc freq ;
  where . < DriverAge1 <= 25
    and U1Factors in ("Failed to Yield", "Wrong Side of Road","Inattentive")
  ;
  where also Light in ("DarkLighted", "DarkNot Lighted", "Dawn");
  tables U1Factors light DriverAge1 DriverAge2 / missing;
run;
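To compare how the TABLES statement treats missing values with and without these options, here is a minimal sketch on a hypothetical dataset named `toy`. Its variable `factor` only stands in for a column such as U1Factors.

```sas
/* Hypothetical toy data: the second row has an empty character value */
data toy;
    infile datalines dsd truncover;
    input id factor :$20.;
    datalines;
1,Inattentive
2,
3,Failed to Yield
4,Inattentive
;
run;

proc freq data=toy;
    tables factor;              /* default: missing row omitted, count shown as Frequency Missing */
    tables factor / missprint;  /* missing row displayed, but excluded from percentages */
    tables factor / missing;    /* missing row displayed and included in percentages */
run;
```

Running the three requests side by side makes it easier to see which one matches the output you want. To leave the empty cells out of the analysis entirely, filter them with a WHERE statement instead.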

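A WHERE condition applies to the whole dataset, not to individual tables. So if you exclude observations with a missing DriverAge1 and observations with a missing U1Factors in the same step: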

proc freq ;
  where not missing(U1Factors) and not missing(DriverAge1);
  tables U1Factors DriverAge1 ;
run;

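then only the observations where neither variable is missing are included. You may therefore want to generate the statistics for each variable separately.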

proc freq ;
  where not missing(U1Factors);
  tables U1Factors ;
run;
proc freq ;
  where not missing(DriverAge1);
  tables DriverAge1 ;
run;