Generating a histogram for a single-column table in SAS

Date: 2019-03-05 10:43:16

Tags: sas histogram

I have a table with only one column, and I want to generate a histogram from that column.

age
---
22 
33
40
74

ods graphics / reset width=6.4in height=4.8in imagemap;

proc sgplot data=WORK.COMBINE;
    title height=14pt "Displaying maximum";
    histogram age / showbins;
    density age;
    density age / type=Kernel;

run;

ods graphics / reset; title;

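The problem I am facing is that the plot does not show the numbers on the corresponding y-axis. Although I have only one column, I want a histogram that shows the highest number, with each value keeping its correspondence on the y-axis. To my surprise, even the highest value gets a shorter bar than the lowest value.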

1 answer:

Answer 0: (score: 2)

With only these 4 data values, the histogram looks like this:

[Image: histogram of the four age values]

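The plotting routine computes the age ranges of the bins and where the bin centers fall. The computation is an internal algorithm, which you can control with the histogram statement options / binstart= binwidth= nbins=.

Histogram bars correspond to bins, of course, and each bar's height is scaled according to the relative number of values that fall in its bin. The y-axis will be either the actual count or the percent of the count. You have 4 values falling into 3 bins, so one of the bins has a count of 2 (or 50% = 2/4). The bar for the highest values is shorter than the bar for the lowest values because there are fewer high values than low values.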
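As a rough sketch of the points above, the four values from the question can be plotted with counts on the y-axis and explicitly chosen bins. The dataset name work.four and the particular binstart= and binwidth= values are assumptions made only for illustration; scale=count asks for counts rather than percents.

```sas
/* Hypothetical recreation of the four-row table from the question */
data work.four;
  input age;
  datalines;
22
33
40
74
;
run;

proc sgplot data=work.four;
  title "Counts on the y-axis, explicit bins";
  /* scale=count puts the number of rows per bin on the y-axis.      */
  /* binstart= sets the x coordinate of the first bin and binwidth=  */
  /* sets the width of every bin; both values here are illustrative. */
  histogram age / scale=count binstart=30 binwidth=20 showbins;
run;
title;
```

Each bar still represents a bin rather than a single row, so its height is the number of ages in that range, not the age itself.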

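What happens when you have more data?

Here is some code that creates 250 normally distributed values and draws histograms of them. It also draws a needle plot of the cumulative frequency.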

data work.have;
  do personid = 1 to 250;
    do until (18 <= age <= 60);
      age = floor(18 + (32 + sqrt(62) * rannor(123)));
    end;
    output;
  end;
run;

proc freq noprint data=have;
  table age / out=freq outcum;  * data for needle plot;
run;

proc sgplot data=have;
    title height=14pt "Default bins";
    histogram age / showbins;
    density age;
    density age / type=Kernel;
run;

proc sgplot data=have;
    title height=14pt "binstart=20 binwidth=2";
    histogram age / showbins binstart=20 binwidth=2;
    density age;
    density age / type=Kernel;
run;

proc sgplot data=freq;
    title height=10pt "cum_freq needle plot of data from Proc FREQ output";
    needle x=age y=cum_freq;
run;

[Image: the two histograms and the cumulative-frequency needle plot]

More example code showing the effect of nbins and xaxis:

ods graphics / reset width=500px height=250px imagemap;
proc sgplot data=have;
    title  height=12pt "binstart=0 binwidth=2 nbins=50";
    title2 height=12pt "xaxis min=0 max=100";
    histogram age / showbins binstart=0 binwidth=2 nbins=50;
    density age;
    density age / type=Kernel;
    xaxis min=0 max=100;
run;

proc sgplot data=have;
    title  height=12pt "binstart=0 binwidth=2 nbins=50";
    title2 height=12pt "xaxis min=-100 max=200";
    histogram age / showbins binstart=0 binwidth=2 nbins=50;
    density age;
    density age / type=Kernel;
    xaxis min=-100 max=200;
run;

[Image: the two histograms drawn with different xaxis ranges]
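To look at nbins= on its own, a variation like the following may help. It is only a sketch: it reuses the work.have dataset created earlier, and the choice of 10 bins and the axis range are arbitrary values picked for illustration.

```sas
/* Assumes work.have from the earlier data step already exists */
proc sgplot data=have;
  title "nbins=10, x-axis range left to the default";
  /* Only the number of bins is requested; the bin width is left to SAS */
  histogram age / showbins nbins=10;
run;

proc sgplot data=have;
  title "nbins=10 with xaxis min=0 max=100";
  histogram age / showbins nbins=10;
  /* Widening the axis changes how much of the plot the bars occupy, */
  /* not how the values are assigned to bins                         */
  xaxis min=0 max=100;
run;
title;
```

Comparing the two outputs side by side should make it easier to see which changes come from the binning and which come from the axis range.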

To look at the distribution of a variable across different class groups, you may want to step up to SGPANEL:

data work.have2;
  do year = 2017, 2018;
    do group = 'Team A', 'Team B', 'Team C';
      do _n_ = 1 to 250;
        personId + 1;
        do until (18 <= age <= 95);
          age = floor(6 + (32 + sqrt(95) * rannor(123)));
        end;
        output;
      end;
    end;
  end;
run;

ods graphics / reset;

title;
proc sgpanel data=have2;
  panelby year group / layout=lattice;
  histogram age;
run;

[Image: SGPANEL lattice of age histograms by year and group]
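In PROC SGPANEL the shared axes are controlled with the colaxis and rowaxis statements rather than xaxis and yaxis. A minimal sketch, assuming the work.have2 dataset from above and using axis limits and labels chosen only for illustration:

```sas
/* Assumes work.have2 from the data step above already exists */
proc sgpanel data=have2;
  panelby year group / layout=lattice;
  /* scale=count shows the number of people per bin in each cell */
  histogram age / scale=count;
  /* colaxis controls the shared horizontal axis of the panel columns */
  colaxis min=0 max=100 label="Age";
  /* rowaxis controls the shared vertical axis of the panel rows */
  rowaxis label="Count";
run;
```

Because every cell shares the same column and row axes, the bars can be compared directly across years and teams.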