I am trying to find a way in SAS to create observations based on several other observations.
For example, I have the following table:
+------+--------------------+--------------------+
| ID   | START_DATE         | END_DATE           |
+------+--------------------+--------------------+
| ABC1 | 01FEB2015:00:00:00 | 30NOV2016:00:00:00 |
| ABC1 | 01JAN2017:00:00:00 | 30NOV2018:00:00:00 |
+------+--------------------+--------------------+
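I want to create a table that covers every timestamp from 01JAN2014 to 31DEC2020. In other words, this means creating 2 more observations for the dataset, so that it looks like this: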
+------+--------------------+--------------------+
| ID   | START_DATE         | END_DATE           |
+------+--------------------+--------------------+
| ABC1 | 01FEB2014:00:00:00 | 31JAN2015:00:00:00 |
| ABC1 | 01FEB2015:00:00:00 | 30NOV2016:00:00:00 |
| ABC1 | 01DEC2016:00:00:00 | 30NOV2018:00:00:00 |
| ABC1 | 01DEC2018:00:00:00 | 31DEC2020:00:00:00 |
+------+--------------------+--------------------+
The SAS code to recreate this example is:
DATA test;
    /* Datetime values use the standard ddMMMyyyy:hh:mm:ss layout (18 characters) */
    INPUT ID :$4. START_DATE :datetime18. END_DATE :datetime18.;
    FORMAT START_DATE datetime20. END_DATE datetime20.;
    CARDS;
ABC1 01FEB2014:00:00:00 31JAN2015:00:00:00
ABC1 01JAN2017:00:00:00 30NOV2018:00:00:00
;
RUN;
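I cannot see a way to do this in SAS.
Answer 0 (score: 0)
You can fill in (or compute) the gaps within a range using basic comparisons, a few hold variables, and one retained variable.
Example:
This assumes that the ranges do not overlap and that, within each id, the rows are sorted with the lowest range first.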
data have;
    input id x1 x2;
    datalines;
1 3 7
1 11 14
2 4 9
2 15 18
3 1 11
4 11 20
5 1 2
5 3 4
5 5 9
5 10 20
;
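data want;
    set have;
    by id;
    length type $6;
    * fill in ranges for every integer 1 through 20;
    * bot is the lowest value not yet covered for the current id;
    retain bot;
    if first.id then bot = 1;

    if bot < x1 then do;
        * there is a gap before this range: emit the gap row first;
        hold1 = x1;
        hold2 = x2;
        x1 = bot;
        x2 = hold1 - 1;
        type = 'gap -';
        output;
        * then restore and emit the original range;
        x1 = hold1;
        x2 = hold2;
        type = 'have';
        bot = x2 + 1;
        output;
    end;
    else if x1 <= bot <= x2 then do;
        * no gap: the range starts exactly where coverage left off;
        bot = x2 + 1;
        type = 'have';
        output;
    end;

    if last.id and 20 >= bot > x2 then do;
        * trailing gap from the last range up to the upper bound 20;
        type = 'gap +';
        x1 = bot;
        x2 = 20;
        output;
    end;
    keep type id x1 x2 bot;
run;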
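The same idea can be carried over to the datetime ranges from the question. The following is only a sketch under several assumptions that are not stated in the original posts: the dataset is the question's `test`, ranges do not overlap, ranges work at day granularity (one range ends on a given day and the next starts on the following day at 00:00:00), and the coverage bounds are 01JAN2014 and 31DEC2020. The names `lo`, `hi`, `want_dt`, `bot`, `hold1` and `hold2` are illustrative. It is a simplified variant of the step above that always emits the original row and writes each gap as its own row, so it does not merge gaps into neighboring ranges.

```sas
/* Assumed coverage bounds, taken from the question's wording */
%let lo = '01JAN2014:00:00:00'dt;
%let hi = '31DEC2020:00:00:00'dt;

/* BY-group processing needs the rows ordered by ID and start */
proc sort data=test;
    by ID START_DATE;
run;

data want_dt;
    set test;
    by ID;
    length type $4;
    retain bot;                         /* first datetime not yet covered */
    format START_DATE END_DATE datetime20.;

    if first.ID then bot = &lo;

    hold1 = START_DATE;
    hold2 = END_DATE;

    /* Gap before the current range: runs from bot to the day before it */
    if bot < hold1 then do;
        START_DATE = bot;
        END_DATE   = intnx('dtday', hold1, -1);
        type = 'gap';
        output;
    end;

    /* The original range itself */
    START_DATE = hold1;
    END_DATE   = hold2;
    type = 'have';
    output;

    /* Coverage now extends to the day after this range ends */
    bot = intnx('dtday', hold2, 1);

    /* Trailing gap up to the assumed upper bound */
    if last.ID and bot <= &hi then do;
        START_DATE = bot;
        END_DATE   = &hi;
        type = 'gap';
        output;
    end;

    drop hold1 hold2 bot;
run;
```

`INTNX` with the `'dtday'` interval shifts a datetime value by whole days and, with its default alignment, returns the start of the resulting day. If the real data has finer granularity than one day, the two `intnx` calls would need a different interval or a plain offset in seconds.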