For clarity, here is some simple sample data:
PERSON_PINCODE
PERSON          START_DATE  END_DATE  PINCODE
10023541700000  01-01-12    31-03-12  6059
10023541700000  01-01-12    31-03-12  6060

PINCODE_VALUE
PINCODE  START_DATE  END_DATE  VAR   VALUE
6059     01-04-11    30-06-11  3889  28.4
6059     01-07-11    30-09-11  3889  28.2
6059     01-10-11    31-12-11  3890  31.4
6060     01-04-11    30-06-11  3889  29.4
6060     01-07-11    30-09-11  3889  41.2
6060     01-10-11    31-12-11  3890  43.4
Output should be:
PERSON_PINCODE_VALUE
PERSON          START_DATE  END_DATE  PINCODE  VAR   VALUE  DIFF
10023541700000  01-01-12    31-03-12  6059     3889  28.2   93 days
10023541700000  01-01-12    31-03-12  6059     3890  31.4   1 day
10023541700000  01-01-12    31-03-12  6060     3889  41.2   93 days
10023541700000  01-01-12    31-03-12  6060     3890  43.4   1 day
To get PERSON_PINCODE_VALUE, perform the following steps:
1) Take each row of PERSON_PINCODE and find PINCODE, START_DATE, END_DATE
2) For each PINCODE from step 1, find START_DATE, END_DATE, VAR, VALUE from PINCODE_VALUE
3) Associate all values from step 1 and step 2, on the basis of PINCODE, START_DATE and END_DATE
4) If in step 3, we do NOT get exact PINCODE, START_DATE and END_DATE for each VAR from step 2, find nearest prior START_DATE for remaining VAR
5) Associate values from step 4 with DIFF as PERSON_PINCODE.START_DATE - PINCODE_VALUE.END_DATE
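The five steps above can also be read procedurally. Below is a minimal Python sketch of that reading, not code from the question: the function name `person_pincode_value` and the tuple layouts are hypothetical, "nearest prior" is assumed to mean the PINCODE_VALUE row with the latest START_DATE before the person's START_DATE, and DIFF is assumed to be 0 for an exact match (the question does not say).

```python
from datetime import date

# Sample rows from the question: (PERSON, START_DATE, END_DATE, PINCODE)
PERSON_PINCODE = [
    ("10023541700000", date(2012, 1, 1), date(2012, 3, 31), 6059),
    ("10023541700000", date(2012, 1, 1), date(2012, 3, 31), 6060),
]

# Sample rows from the question: (PINCODE, START_DATE, END_DATE, VAR, VALUE)
PINCODE_VALUE = [
    (6059, date(2011, 4, 1), date(2011, 6, 30), 3889, 28.4),
    (6059, date(2011, 7, 1), date(2011, 9, 30), 3889, 28.2),
    (6059, date(2011, 10, 1), date(2011, 12, 31), 3890, 31.4),
    (6060, date(2011, 4, 1), date(2011, 6, 30), 3889, 29.4),
    (6060, date(2011, 7, 1), date(2011, 9, 30), 3889, 41.2),
    (6060, date(2011, 10, 1), date(2011, 12, 31), 3890, 43.4),
]


def person_pincode_value(persons, values):
    """Hypothetical helper: one output row per (person row, VAR)."""
    result = []
    for person, p_start, p_end, pincode in persons:           # step 1
        by_var = {}
        for v_pin, v_start, v_end, var, value in values:      # step 2
            if v_pin == pincode:
                by_var.setdefault(var, []).append((v_start, v_end, value))
        for var in sorted(by_var):
            rows = by_var[var]
            # Step 3: look for an exact START_DATE / END_DATE match.
            exact = [r for r in rows if r[0] == p_start and r[1] == p_end]
            if exact:
                _, _, value = exact[0]
                diff = 0  # assumption: exact matches get DIFF 0
            else:
                # Step 4: fall back to the nearest prior START_DATE.
                prior = [r for r in rows if r[0] < p_start]
                if not prior:
                    continue  # no earlier row for this VAR
                _, v_end, value = max(prior, key=lambda r: r[0])
                # Step 5: DIFF = PERSON_PINCODE.START_DATE - PINCODE_VALUE.END_DATE
                diff = (p_start - v_end).days
            result.append((person, p_start, p_end, pincode, var, value, diff))
    return result


if __name__ == "__main__":
    for row in person_pincode_value(PERSON_PINCODE, PINCODE_VALUE):
        print(row)
```

On the sample data this yields one row per PINCODE and VAR, which is the shape the SQL answers need to reproduce.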
Answer 0: (score: 0)
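SELECT p.person,
       p.start_date,
       p.end_date,
       p.pincode,
       v.var,
       -- VALUE of the PINCODE_VALUE row whose START_DATE is closest
       -- to the person's START_DATE
       MAX( v.value )
         KEEP ( DENSE_RANK FIRST ORDER BY ABS( v.start_date - p.start_date ) )
         AS value,
       -- DIFF is 0 on an exact START_DATE match; otherwise the days between
       -- the person's START_DATE and the END_DATE of that closest row
       DECODE(
         MIN( ABS( v.start_date - p.start_date ) ),
         0, 0,
         MIN( ABS( p.start_date - v.end_date ) )
           KEEP ( DENSE_RANK FIRST ORDER BY ABS( p.start_date - v.start_date ) )
       ) AS diff
FROM   Person_Pincode p
       INNER JOIN
       Pincode_Value v
       ON ( p.pincode = v.pincode )
-- Note: ABS() picks the nearest row in either direction; nothing here
-- restricts the match to rows that start before the person's START_DATE.
GROUP BY p.person,
         p.start_date,
         p.end_date,
         p.pincode,
         v.var;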
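For readers unfamiliar with `KEEP ( DENSE_RANK FIRST ORDER BY ... )`: within each group, the aggregate is applied only to the rows that rank first under the given ordering. The following Python sketch emulates that for a single group; the helper name `keep_dense_rank_first_max` is hypothetical, and the snippet is only an illustration of the idea, not Oracle's implementation.

```python
from datetime import date


def keep_dense_rank_first_max(rows, key, value):
    """Emulate MAX(value) KEEP (DENSE_RANK FIRST ORDER BY key) for one group."""
    rows = list(rows)
    best = min(key(r) for r in rows)            # the ordering value ranked first
    # Aggregate only over the rows tied at that first rank.
    return max(value(r) for r in rows if key(r) == best)


# One group from the sample data: PINCODE 6059, VAR 3889.
p_start = date(2012, 1, 1)
group = [
    (date(2011, 4, 1), 28.4),   # (PINCODE_VALUE.START_DATE, VALUE)
    (date(2011, 7, 1), 28.2),
]

closest_value = keep_dense_rank_first_max(
    group,
    key=lambda r: abs((r[0] - p_start).days),   # ABS(v.start_date - p.start_date)
    value=lambda r: r[1],
)
print(closest_value)
```

The row starting 01-07-11 is closer to 01-01-12 than the one starting 01-04-11, so its VALUE is the one kept for this group.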
Answer 1: (score: 0)
This is a "more conventional" approach that uses ROW_NUMBER:
with hlpr as (
  select pp.PERSON, pp.START_DATE, pp.END_DATE, pp.PINCODE,
         pv.START_DATE PV_START_DATE, pv.END_DATE PV_END_DATE,
         pv.VALUE, pv.VAR,
         pp.START_DATE - pv.END_DATE as diff,
         row_number() over(partition by pp.PERSON, pp.START_DATE, pp.END_DATE, pp.PINCODE, pv.VAR
                           order by pp.START_DATE - pv.END_DATE) as rn
  from PERSON_PINCODE pp
  join PINCODE_VALUE pv on pp.PINCODE = pv.PINCODE
  where pp.START_DATE >= pv.END_DATE /* prior PV only */
)
select /*+ PARALLEL(5) */ PERSON, START_DATE, END_DATE, PINCODE, VAR, VALUE, DIFF
from hlpr
where rn = 1
order by 1,2,3,4;
which gives
PERSON START_DATE END_DATE PINCODE VAR VALUE DIFF
-------------- ----------------- ----------------- ---------- ---------- ---------- ----------
10023541700000 01.01.12 00:00:00 31.03.12 00:00:00 6059 3889 28,2 93
10023541700000 01.01.12 00:00:00 31.03.12 00:00:00 6059 3890 31,4 1
10023541700000 01.01.12 00:00:00 31.03.12 00:00:00 6060 3889 41,2 93
10023541700000 01.01.12 00:00:00 31.03.12 00:00:00 6060 3890 43,4 1
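The DIFF column in this output is a plain date subtraction (`pp.START_DATE - pv.END_DATE`), so the two distinct values can be checked by hand. A small Python check, using only the dates from the sample data (the variable names are illustrative):

```python
from datetime import date

person_start = date(2012, 1, 1)           # PERSON_PINCODE.START_DATE

# VAR 3889: nearest prior PINCODE_VALUE row ends on 30-09-11
diff_3889 = (person_start - date(2011, 9, 30)).days

# VAR 3890: nearest prior PINCODE_VALUE row ends on 31-12-11
diff_3890 = (person_start - date(2011, 12, 31)).days

print(diff_3889, diff_3890)  # 93 1
```

The 93 comes from one day to reach 1 October plus the 31 + 30 + 31 days of October, November and December 2011.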
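I prefer this solution when the query has a large number of attributes, because the ordering logic is concentrated in one place.
Only for large data sets would I use @MT0's GROUP BY approach, because SORT GROUP BY usually performs better than WINDOW SORT.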
Here is the sample data:
create table PERSON_PINCODE as
select '10023541700000' PERSON, to_date('01-01-12','dd-mm-rr') START_DATE, to_date('31-03-12','dd-mm-rr') END_DATE, 6059 PINCODE from dual union all
select '10023541700000' PERSON, to_date('01-01-12','dd-mm-rr') START_DATE, to_date('31-03-12','dd-mm-rr') END_DATE, 6060 PINCODE from dual
;
create table PINCODE_VALUE as
select 6059 PINCODE, to_date('01-04-11','dd-mm-rr') START_DATE, to_date('30-06-11','dd-mm-rr') END_DATE, 3889 VAR, 28.4 VALUE from dual union all
select 6059 PINCODE, to_date('01-07-11','dd-mm-rr') START_DATE, to_date('30-09-11','dd-mm-rr') END_DATE, 3889 VAR, 28.2 VALUE from dual union all
select 6059 PINCODE, to_date('01-10-11','dd-mm-rr') START_DATE, to_date('31-12-11','dd-mm-rr') END_DATE, 3890 VAR, 31.4 VALUE from dual union all
select 6060 PINCODE, to_date('01-04-11','dd-mm-rr') START_DATE, to_date('30-06-11','dd-mm-rr') END_DATE, 3889 VAR, 29.4 VALUE from dual union all
select 6060 PINCODE, to_date('01-07-11','dd-mm-rr') START_DATE, to_date('30-09-11','dd-mm-rr') END_DATE, 3889 VAR, 41.2 VALUE from dual union all
select 6060 PINCODE, to_date('01-10-11','dd-mm-rr') START_DATE, to_date('31-12-11','dd-mm-rr') END_DATE, 3890 VAR, 43.4 VALUE from dual;