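I have been struggling with this problem for several days, and now I am turning to the crowd for help.
My problem is similar to, but not quite the same as, an earlier solution on this site: PL/SQL Split, separate a date into new dates according to black out dates! That solution is fairly boolean (include/exclude), whereas my problem involves some of that plus merging.
I consider my grasp of SQL and PL/SQL to be intermediate to advanced, but Oracle analytic functions clearly confuse me. I have been trying to read up and learn, but I have run out of time.
Since I am unsure about the legality of sharing table names (COTS), lines of business, and so on, I will mimic my problem with an obfuscated scenario. Hopefully this satisfies the spirit of the lawyers.
On to the problem: I have a table containing a history of customer activity. Customers can come and go, so this table may hold multiple rows per customer.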
CREATE TABLE activity AS
SELECT 1 AS cust_id,
       TO_DATE('01-JAN-2010', 'DD-MON-YYYY') AS start_dt,
       TO_DATE('31-JUL-2010', 'DD-MON-YYYY') AS end_dt,
       'EAST' AS region
  FROM DUAL
UNION
SELECT 1 AS cust_id,
       TO_DATE('01-FEB-2011', 'DD-MON-YYYY') AS start_dt,
       TO_DATE('31-DEC-2011', 'DD-MON-YYYY') AS end_dt,
       'EAST' AS region
  FROM DUAL;
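I also have a table containing attribute information with date spans. A customer can have several attribute types at the same time, and each type can occur multiple times with different time spans.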
CREATE TABLE attrib AS
SELECT 1 AS cust_id,
       'POWER' AS atb_cd,
       TO_DATE('01-JAN-2009', 'DD-MON-YYYY') AS atb_start_dt,
       TO_DATE('31-JAN-2010', 'DD-MON-YYYY') AS atb_end_dt,
       'LocalNuke' AS provider,
       1.80 AS per_kwh,
       0 AS per_gal
  FROM DUAL
UNION
SELECT 1 AS cust_id,
       'POWER' AS atb_cd,
       TO_DATE('01-MAR-2010', 'DD-MON-YYYY') AS atb_start_dt,
       TO_DATE('31-MAR-2010', 'DD-MON-YYYY') AS atb_end_dt,
       'CoalGuys' AS provider,
       1.60 AS per_kwh,
       0 AS per_gal
  FROM DUAL
UNION
SELECT 1 AS cust_id,
       'POWER' AS atb_cd,
       TO_DATE('01-JUN-2010', 'DD-MON-YYYY') AS atb_start_dt,
       TO_DATE('30-SEP-2010', 'DD-MON-YYYY') AS atb_end_dt,
       'LocalNuke' AS provider,
       1.70 AS per_kwh,
       0 AS per_gal
  FROM DUAL
UNION
SELECT 1 AS cust_id,
       'POWER' AS atb_cd,
       TO_DATE('01-MAR-2011', 'DD-MON-YYYY') AS atb_start_dt,
       TO_DATE('31-DEC-9999', 'DD-MON-YYYY') AS atb_end_dt,
       'GeoHeat' AS provider,
       1.10 AS per_kwh,
       0 AS per_gal
  FROM DUAL
UNION
SELECT 1 AS cust_id,
       'WATER' AS atb_cd,
       TO_DATE('01-MAR-2010', 'DD-MON-YYYY') AS atb_start_dt,
       TO_DATE('31-DEC-9999', 'DD-MON-YYYY') AS atb_end_dt,
       'GlacialGold' AS provider,
       0 AS per_kwh,
       0.60 AS per_gal
  FROM DUAL;
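The oddities in the data are intentional; I tried to make the scenario realistic without tying it to the "real world".
The result should be restricted to the spans of the customer's activity with this fictitious company, with all overlapping dates split apart to form a timeline. The data elements need to be merged together for reporting.
Visually: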
Cust:
|----------------------| |------------------------|
Power:
|-------------| |--| |-------| |---------------------->
Water:
|------------------------------------------------------>
Expected Result:
|----|----|--|----|----| |----|-------------------|
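The solution should be extensible to additional attributes. In the end I want this denormalized information in a table so that I can report on a customer's data at any point in time. For example, if on a given day they have activity, power, and water, I should be able to derive the per_kwh, per_gal, and activity data for that day.
Sample output (tabular):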
CUST_ID FROM_DT     THRU_DT     REGION POWER_PROVIDER WATER_PROVIDER PER_KWH PER_GAL
------- ----------- ----------- ------ -------------- -------------- ------- -------
      1 01-JAN-2010 31-JAN-2010 EAST   LocalNuke                        1.80       0
      1 01-FEB-2010 28-FEB-2010 EAST                                       0       0
      1 01-MAR-2010 31-MAR-2010 EAST   CoalGuys       GlacialGold       1.60    0.60
      1 01-APR-2010 31-MAY-2010 EAST                  GlacialGold          0    0.60
      1 01-JUN-2010 31-JUL-2010 EAST   LocalNuke      GlacialGold       1.70    0.60
      1 01-FEB-2011 28-FEB-2011 EAST                  GlacialGold          0    0.60
      1 01-MAR-2011 31-DEC-2011 EAST   GeoHeat        GlacialGold       1.10    0.60
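To make the point-in-time requirement concrete, here is a minimal Python sketch of the lookup such a denormalized table is meant to support. It runs outside the database and is only an illustration: `TIMELINE` and `lookup` are hypothetical names, and the rows are copied from the sample output above.

```python
from datetime import date

# Hypothetical in-memory copy of the sample output rows:
# (cust_id, from_dt, thru_dt, region, power_provider, water_provider, per_kwh, per_gal)
TIMELINE = [
    (1, date(2010, 1, 1), date(2010, 1, 31), "EAST", "LocalNuke", None, 1.80, 0.0),
    (1, date(2010, 2, 1), date(2010, 2, 28), "EAST", None, None, 0.0, 0.0),
    (1, date(2010, 3, 1), date(2010, 3, 31), "EAST", "CoalGuys", "GlacialGold", 1.60, 0.60),
    (1, date(2010, 4, 1), date(2010, 5, 31), "EAST", None, "GlacialGold", 0.0, 0.60),
    (1, date(2010, 6, 1), date(2010, 7, 31), "EAST", "LocalNuke", "GlacialGold", 1.70, 0.60),
    (1, date(2011, 2, 1), date(2011, 2, 28), "EAST", None, "GlacialGold", 0.0, 0.60),
    (1, date(2011, 3, 1), date(2011, 12, 31), "EAST", "GeoHeat", "GlacialGold", 1.10, 0.60),
]


def lookup(cust_id, day):
    """Return the row whose inclusive [from_dt, thru_dt] span covers `day`, or None."""
    for row in TIMELINE:
        if row[0] == cust_id and row[1] <= day <= row[2]:
            return row
    return None


# 15-MAR-2010 falls in the CoalGuys / GlacialGold slice.
row = lookup(1, date(2010, 3, 15))
# 01-SEP-2010 is outside both activity spans, so nothing is returned.
outside = lookup(1, date(2010, 9, 1))
```

Because the slices never overlap, at most one row can match a given customer and day, which is what makes the denormalized table convenient for reporting.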
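Two years ago, when the requirement was only Activity/Power, I wrote something that used 2 asynchronous cursors with slow-by-slow (row-by-row) processing.
Performance matters, but the biggest reason I am looking for a straight/bulk SQL solution is maintenance. The if/else cursor nesting in my original solution is already hard to follow, and it will get exponentially worse with at least 2 more "attribute" spans to split.
I would greatly appreciate any help you can offer.
Answer 0 (score: 1)
This is a really tricky problem, and I expect you will end up with a messy solution. The core issue is that you need to manufacture "pseudo" rows for the gaps in the attrib table. That is the problematic part.
I took a cut-down version of your problem and only tried to manufacture the gaps for the POWER attrib. I took the attitude that every attrib row has a gap before it, and came up with this: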
SELECT PS.cust_id
     , G.is_gap
     , DECODE( G.is_gap, 'Y', PS.prev_start, PS.atb_start_dt ) AS start_date
     , DECODE( G.is_gap, 'Y', PS.prev_end,   PS.atb_end_dt   ) AS end_date
     , DECODE( G.is_gap, 'Y', NULL, PS.provider ) AS provider
     , DECODE( G.is_gap, 'Y', NULL, PS.per_kwh  ) AS per_kwh
     , DECODE( G.is_gap, 'Y', NULL, PS.per_gal  ) AS per_gal
  FROM
     ( SELECT P.cust_id
            , P.atb_start_dt
            , P.atb_end_dt
            , P.provider
            , P.per_kwh
            , P.per_gal
              -- last day of the gap that precedes this row
            , P.atb_start_dt - 1 AS prev_end
              -- day after the previous row's end date; 01-JAN-1900 when there is no previous row
            , NVL( MAX( P.atb_end_dt ) OVER ( ORDER BY P.atb_end_dt
                                              ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING ) + 1
                 , TO_DATE( '01-JAN-1900', 'DD-MON-YYYY' ) ) AS prev_start
         FROM attrib P
        WHERE P.atb_cd = 'POWER'
     ) PS
       -- two-row generator: one copy of each attrib row for its gap ('Y'), one for the row itself ('N')
     , ( SELECT DECODE(LEVEL,1,'Y','N') AS is_gap
           FROM DUAL
        CONNECT BY LEVEL <= 2
       ) G
       -- keep a gap row only when the gap is at least one day long (>= also keeps one-day gaps)
 WHERE ( PS.prev_end >= PS.prev_start
      OR G.is_gap = 'N' )
 ORDER BY 3
/
which gives me these results:
CUST_ID I START_DATE END_DATE   PROVIDER    PER_KWH PER_GAL
------- - ---------- ---------- ----------- ------- -------
      1 Y 01-JAN-00  31-DEC-08
      1 N 01-JAN-09  31-JAN-10  LocalNuke   1.8     0
      1 N 01-FEB-10  31-MAR-10  CoalGuys    1.6     0
      1 Y 01-APR-10  31-MAY-10
      1 N 01-JUN-10  30-SEP-10  LocalNuke   1.7     0
      1 Y 01-OCT-10  28-FEB-11
      1 N 01-MAR-11  31-DEC-99  GeoHeat     1.1     0
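Some notes:
- [...] 31-JUL-2010, since that is when the activity ends?
- The start date of CoalGuys was updated to 01-FEB-2010 to test the case where there is no gap.
- UNION [...]
- Using 9999 as the year is awkward, because you get an error if you try to add anything to that date. It ended up not being a problem here, but it is a pain when you are looking for gaps.

This is still a long way from a complete solution, and it will get even messier once you throw in the customer and the water dates. You would probably need to include the query above as an inline view in your main query, and then do the same thing for WATER. After that you would have to join the two together with date range checks, and then use LEAST and GREATEST to produce the final dates.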
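The last step described above, a date range check followed by GREATEST and LEAST, amounts to intersecting two inclusive date ranges. Here is a minimal Python sketch of that idea, not the finished SQL: `overlap` is a hypothetical helper, and the spans are taken from the sample data in the question.

```python
from datetime import date


def overlap(a_start, a_end, b_start, b_end):
    """Intersect two inclusive date ranges; return (start, end), or None if they do not overlap."""
    start = max(a_start, b_start)  # plays the role of GREATEST on the start dates
    end = min(a_end, b_end)        # plays the role of LEAST on the end dates
    return (start, end) if start <= end else None


# POWER span (CoalGuys) and WATER span (GlacialGold) from the sample data
power = (date(2010, 3, 1), date(2010, 3, 31))
water = (date(2010, 3, 1), date(9999, 12, 31))
both = overlap(*power, *water)

# Clip the combined span to the customer's first activity span
activity = (date(2010, 1, 1), date(2010, 7, 31))
clipped = overlap(*both, *activity)
```

The `start <= end` test is the date range check: when it fails, the two spans do not overlap and there is nothing to emit for that pair.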
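Sorry, after 40+ minutes on this it has gone from being an interesting problem to feeling like work, so I am going to leave my answer incomplete. I hope it helps.
Answer 1 (score: 1)
This might work. It does not merge contiguous spans together, but it still gets the job done.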
WITH
  -- every date on which something can change: each span start, and the day after each span end
  milestone AS
  (
    SELECT cust_id, start_dt AS point_in_time FROM activity
    UNION
    SELECT cust_id, atb_start_dt AS point_in_time FROM attrib
    UNION
    -- LEAST caps the open-ended 31-DEC-9999 so that adding a day does not overflow
    SELECT cust_id, LEAST(end_dt, TO_DATE('30-DEC-9999', 'DD-MON-YYYY')) + 1 AS point_in_time FROM activity
    UNION
    SELECT cust_id, LEAST(atb_end_dt, TO_DATE('30-DEC-9999', 'DD-MON-YYYY')) + 1 AS point_in_time FROM attrib
  )
SELECT
  milestone.cust_id                 AS cust_id,
  milestone.point_in_time           AS from_dt,
  -- each slice runs up to the day before the next milestone
  LEAD(milestone.point_in_time)
    OVER (PARTITION BY milestone.cust_id ORDER BY milestone.point_in_time) - 1
                                    AS thru_dt,
  -- region is NULL for slices that fall outside every activity span
  activity.region                   AS region,
  power_attrib.provider             AS power_provider,
  water_attrib.provider             AS water_provider,
  COALESCE(power_attrib.per_kwh, 0) AS per_kwh,
  COALESCE(water_attrib.per_gal, 0) AS per_gal
FROM
  milestone
  LEFT OUTER JOIN activity
    ON  milestone.cust_id = activity.cust_id
    AND milestone.point_in_time BETWEEN activity.start_dt AND activity.end_dt
  LEFT OUTER JOIN attrib power_attrib
    ON  milestone.cust_id = power_attrib.cust_id
    AND power_attrib.atb_cd = 'POWER'
    AND milestone.point_in_time BETWEEN power_attrib.atb_start_dt AND power_attrib.atb_end_dt
  LEFT OUTER JOIN attrib water_attrib
    ON  milestone.cust_id = water_attrib.cust_id
    AND water_attrib.atb_cd = 'WATER'
    AND milestone.point_in_time BETWEEN water_attrib.atb_start_dt AND water_attrib.atb_end_dt
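For checking the query's logic outside the database, here is a rough Python sketch of the same milestone idea for a single customer. It is an illustration under stated assumptions, not a port of the query: `timeline` and `covering` are hypothetical names, the data is the question's sample data, activity spans are assumed to be closed (never 31-DEC-9999), open-ended attribute spans are skipped rather than capped, and slices outside any activity span are dropped to match the expected output.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)
OPEN_END = date(9999, 12, 31)

# Sample data from the question: (start, end, region)
ACTIVITY = [
    (date(2010, 1, 1), date(2010, 7, 31), "EAST"),
    (date(2011, 2, 1), date(2011, 12, 31), "EAST"),
]
# (atb_cd, start, end, provider, rate); rate is per_kwh for POWER, per_gal for WATER
ATTRIB = [
    ("POWER", date(2009, 1, 1), date(2010, 1, 31), "LocalNuke", 1.80),
    ("POWER", date(2010, 3, 1), date(2010, 3, 31), "CoalGuys", 1.60),
    ("POWER", date(2010, 6, 1), date(2010, 9, 30), "LocalNuke", 1.70),
    ("POWER", date(2011, 3, 1), OPEN_END, "GeoHeat", 1.10),
    ("WATER", date(2010, 3, 1), OPEN_END, "GlacialGold", 0.60),
]


def covering(rows, day, start_idx, end_idx):
    """Return the first row whose inclusive span contains `day`, else None."""
    return next((r for r in rows if r[start_idx] <= day <= r[end_idx]), None)


def timeline():
    # Every span start, and every day after a span end, begins a new slice.
    points = set()
    for start, end, _ in ACTIVITY:
        points.add(start)
        points.add(end + ONE_DAY)
    for _, start, end, _, _ in ATTRIB:
        points.add(start)
        if end < OPEN_END:  # adding a day to 31-DEC-9999 would overflow
            points.add(end + ONE_DAY)
    points = sorted(points)

    power_rows = [a for a in ATTRIB if a[0] == "POWER"]
    water_rows = [a for a in ATTRIB if a[0] == "WATER"]
    out = []
    for frm, nxt in zip(points, points[1:]):  # like LEAD(point_in_time) - 1
        act = covering(ACTIVITY, frm, 0, 1)
        if act is None:  # keep only slices inside an activity span
            continue
        power = covering(power_rows, frm, 1, 2)
        water = covering(water_rows, frm, 1, 2)
        out.append((
            frm, nxt - ONE_DAY, act[2],
            power[3] if power else None,
            water[3] if water else None,
            power[4] if power else 0,
            water[4] if water else 0,
        ))
    return out


rows = timeline()
```

On the sample data this yields seven slices, one per row of the expected output in the question. Handling several customers would mean grouping by cust_id first, which is what PARTITION BY does in the query.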