Question

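I have a dataset containing the registration start and end dates for a set of users. I want to programmatically work out which months fall between those dates for each user, without hard-coding any months. I only need a summary count of registrations per month, so if that is quicker to produce, so much the better.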

E.g. I have something like

User-+-From-------+-To-----------------
A    + 11JAN2011  + 15MAR2011
A    + 16JUN2011  + 17AUG2011
B    + 10FEB2011  + 12FEB2011
C    + 01AUG2011  + 05AUG2011

and I want something like

Month---+-Registrations
JAN2011 + 1 (A)
FEB2011 + 2 (AB)
MAR2011 + 1 (A)
APR2011 + 0
MAY2011 + 0
JUN2011 + 1 (A)
JUL2011 + 1 (A)
AUG2011 + 2 (AC)

Note that I don't need the bits in parentheses; they are only there to clarify my point.

Thanks for your help.

Answer 1

One simple approach is to build an intermediate dataset with one row per user per month, and then run PROC FREQ on it.

data have;
informat from to DATE9.;
format from to DATE9.;
input user $ from to;
datalines;
A     11JAN2011   15MAR2011
A     16JUN2011   17AUG2011
B     10FEB2011   12FEB2011
C     01AUG2011   05AUG2011
;;;;
run;

data int;
set have;
_mths=intck('month',from,to,'d');  *number of months after the current one (0=current one). 'd'=discrete=count 1st of month as new month;
do _i = 0 to _mths; *start with current month, iterate over months;
  month = intnx('month',from,_i,'b');
  output;
end;
format month MONYY7.;
run;

proc freq data=int;
tables month/out=want(keep=month count rename=count=registrations);
run;

You can eliminate the separate _mths assignment by calling intck directly in the bounds of the do loop.
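A minimal sketch of that variant, assuming the HAVE dataset defined above. The only changes from the original step are that the intck call moves into the loop bounds and the loop counter is dropped from the output; the drop statement is my addition, not part of the original answer.

```sas
/* Same expansion as the INT step above, without the _mths variable. */
data int;
  set have;
  /* The upper bound is evaluated once, when the loop starts. */
  do _i = 0 to intck('month', from, to, 'd');
    /* 'b' aligns the result to the first day of the month. */
    month = intnx('month', from, _i, 'b');
    output;
  end;
  format month MONYY7.;
  drop _i;  /* assumption: the counter is not needed downstream */
run;
```

The PROC FREQ step can then be run unchanged on the resulting INT dataset.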
