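I have a table with people's names, dates of birth, and dates of death (1900-2000). I need the number of people alive in each year of a given period, for example a population of 2.3 billion in 1940, 2.4 billion in 1941, 2.2 billion in 1942, and so on up to 1950.
I am working in SAS Enterprise Guide, so the code may look a little different from ordinary SQL. At a minimum, I would like to see something like this: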
Population | Year
2.300.000.000 | 1940
2.400.000.000 | 1941
.....................
select
count(name),
from db
where bd<1jan1940 and dd>=1jan1940 and dd=<31dec1940
group by month
Answer 0 (score: 0)
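First, you need to know the initial population at the end of 1899. Let's say it is 2 billion. Then add up births minus deaths for each year. (To do that you have to read the table twice, once for births and once for deaths.) Use SUM OVER to get the running total.
I am not sure which DBMS you are actually using, but this is fairly standard SQL: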
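-- running total of (births - deaths) on top of the assumed initial population
select yr,
       2000000000
       + sum(coalesce(births.cnt, 0) - coalesce(deaths.cnt, 0)) over (order by yr) as population
from
(
  select extract(year from bd) as yr, count(*) as cnt
  from db
  group by extract(year from bd)
) births
-- full outer join keeps years that have only births or only deaths
full outer join
(
  select extract(year from dd) as yr, count(*) as cnt
  from db
  where dd is not null  -- people without a death date are still alive
  group by extract(year from dd)
) deaths using (yr)
order by yr;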
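The running-total idea behind the query above can be checked on a handful of rows. This is a minimal sketch in Python rather than SQL, under stated assumptions: the sample rows, the function name `population_by_year`, and the `initial` parameter are hypothetical and only illustrate the arithmetic (initial population plus cumulative births minus deaths).

```python
from collections import Counter

# Hypothetical sample rows: (name, birth_year, death_year).
# None as death_year means the person is still alive.
people = [
    ("Ann", 1900, 1950),
    ("Bob", 1901, None),
    ("Cid", 1901, 1903),
    ("Dee", 1903, 1903),
]


def population_by_year(rows, first_year, last_year, initial=0):
    """Return {year: population at the end of that year}.

    `initial` is the assumed population just before `first_year`;
    births and deaths outside the requested range are not counted.
    """
    births = Counter(birth for _, birth, _ in rows)
    deaths = Counter(death for _, _, death in rows if death is not None)

    result = {}
    running = initial
    for year in range(first_year, last_year + 1):
        # A Counter returns 0 for years without events, so gaps are harmless.
        running += births[year] - deaths[year]
        result[year] = running
    return result


if __name__ == "__main__":
    for year, population in population_by_year(people, 1900, 1904).items():
        print(year, population)
```

Years without any births or deaths simply carry the previous total forward, which is the case an inner join between the two subqueries would drop.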
Answer 1 (score: 0)
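/* Generate 10,000 test rows with random dates of birth and death */
data dob_data;
  do i = 1 to 10000;
    num = ceil(rand('UNIFORM',0,10));
    /* random date of birth within roughly 101 years after 01JAN1899 */
    dob = intnx('day','01JAN1899'd,ceil(rand('UNIFORM',1,36865)));
    select (num);
      /* about 1 row in 10 gets a date of death; the rest stay missing */
      when (1) dod = intnx('day',dob,ceil(rand('UNIFORM',1,36865)));
      otherwise dod = .;
    end;
    output;
  end;
  format dob dod date9.;
  drop num;
run;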
/* One row per year from 1900 to 2000 with its start (soy) and end (eoy) dates */
data calendar;
  do i=0 to 100;
    year = 1900+i;
    soy = intnx('year','01JAN1900'd,i,'s');
    eoy = intnx('year','01JAN1900'd,i,'e');
    output;
  end;
  format soy eoy date9.;
run;
/* Pair every person with every year and count per year. */
/* A missing DOD is replaced by a far-future date, so the person counts as alive. */
proc sql;
  create table pop as
  select year,
    sum(case when DOB < soy and coalesce(DOD,'31DEC2200'd) ge soy then 1 else 0 end) as Alive_At_Start,
    sum(case when DOB between soy and eoy then 1 else 0 end) as Born_During,
    sum(case when coalesce(DOD,'31DEC2200'd) between soy and eoy then -1 else 0 end) as Passed,
    sum(case when DOB le eoy and coalesce(DOD,'31DEC2200'd) > eoy then 1 else 0 end) as Alive_At_End
  from dob_data t1, calendar t2
  group by year;
quit;