Aggregating monthly data by year in SAS

Time: 2015-10-21 02:09:20

Tags: sql sum sas aggregate

I have the following dataset:

  Date        Occupation      Count
Jan2006        Nurse            15
Jan2006        Lawyer           2
Jan2006        Mechanic         3
Feb2006        Economist        2
Feb2006        Lawyer           1
Feb2006        Nurse            5

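The data continues through December 2014, with a different set of occupations and counts each month. What I want is the yearly total for each occupation. So, assuming the data above covered every month, I would like my final dataset to look like this: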

Date     Occupation    Sum
2006      Nurse         20
2006      Lawyer        3
2006      Mechanic      3
2006      Economist     2
and so on until Dec2014. 

I tried using first.variable and last.variable as shown below, but it did not work.

data want;
   set have;
if first.date and first.Occupation then sum = 0;
sum+Count;
if last.date and last.occupation then output; 
run;

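But this did not give me the output I need. I suspect this could be done easily in SQL, but I am not familiar with SQL, so I am hesitant to use it.

Thanks in advance for your help.

4 answers:

Answer 0 (score: 1)

Try this: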

proc sql;
    create table want as
    select year(date) as date, occupation,sum(count) as sum from have
    group by year(date),occupation;
quit;
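
The query above relies on date being a numeric SAS date. As a minimal sketch for the other case, assuming date is stored as a character string such as Jan2006 (an assumption, since the question does not say how the column is stored), the string can be converted with the monyy7. informat before the year is extracted. The output column name year is chosen here only for illustration.

```sas
proc sql;
    create table want as
    /* input() converts text such as 'Jan2006' to a SAS date (assumes date is character) */
    select year(input(date, monyy7.)) as year,
           occupation,
           sum(count) as sum
    from have
    /* repeat the expression so the grouping does not depend on the column alias */
    group by year(input(date, monyy7.)), occupation;
quit;
```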

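Answer 1 (score: 1)

Since you are using SAS, you can take advantage of procedures such as proc summary that group by the formatted value of a variable. So if you apply the year. format to the Date variable, the procedure will automatically group by year.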

data have;
input Date :monyy7. Occupation $20. Count;
format date monyy7.;
datalines;
Jan2006        Nurse            15
Jan2006        Lawyer           2
Jan2006        Mechanic         3
Feb2006        Economist        2
Feb2006        Lawyer           1
Feb2006        Nurse            5
;
run;

proc summary data=have nway;
class date occupation / order=freq; /* order class levels by descending frequency (number of observations) */
format date year.; /* apply year format to date for grouping purposes */
var count;
output out=want (drop=_:) sum=;
run;

Answer 2 (score: 0)

With a pure data step and proc step approach, you could do it as follows:

data test;
  infile datalines;
  input MonYr monyy7. Occupation :$20. Count;  /* :$20. avoids truncating Economist to 8 characters */
  datalines;
Jan2006        Nurse            15
Jan2006        Lawyer           2
Jan2006        Mechanic         3
Feb2006        Economist        2
Feb2006        Lawyer           1
Feb2006        Nurse            5
;
run;

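data test_year;
  set test;
  Date=Year(MonYr);  /* calendar year, used as a BY variable below */
run;

proc sort data=test_year;
  by Date Occupation;
run;

data result(drop=MonYr Count);
  set test_year;
  by Date Occupation;
  retain Sum 0;
  if first.Occupation then Sum=Count;  /* start a new total for each year/occupation group */
  else Sum=Sum+Count;

  if last.Occupation;  /* output one row per year and occupation */
run;

The year is derived before sorting so that it can serve as a BY variable. Summing by Occupation alone would combine every year into a single total per occupation.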
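
As a sketch of an alternative that avoids creating a separate year variable: the DATA step BY statement accepts a GROUPFORMAT option, which makes first. and last. follow the formatted value of a BY variable instead of its internal value. Assuming the test dataset created above, with MonYr holding a numeric SAS date, applying the year4. format turns each calendar year into one BY group. The dataset names test_sorted and result_by_year are made up for this example.

```sas
proc sort data=test out=test_sorted;
  by Occupation MonYr;  /* dates in ascending order keep each year contiguous */
run;

data result_by_year(keep=Date Occupation Sum);
  set test_sorted;
  format MonYr year4.;             /* formatted value is the four-digit year */
  by Occupation MonYr groupformat; /* BY groups are defined by the formatted value */
  if first.MonYr then Sum = 0;     /* reset at the start of each occupation/year group */
  Sum + Count;                     /* sum statement: Sum is retained across rows */
  if last.MonYr;                   /* keep one row per occupation and year */
  Date = year(MonYr);
run;
```

This keeps a single sort and a single DATA step, at the cost of relying on the format to define the groups, which is less obvious to a reader than an explicit year variable.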

Answer 3 (score: 0)

    select substring([date],charindex('2',[date]),len([date])),Occupation,sum([count]) 
    from sas group by substring([date],charindex('2',[date]),len([date])),Occupation