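I have hierarchical data in which the instances of an entity are linked through DATE_FROM and DATE_TO.
See the sqlfiddle.
Using CONNECT BY, I can determine the number of consecutive instances of each entity, i.e. the "length of the island", which is mostly what I want. For example, this gives the expected island length for each entity with a DATE_FROM in 2014: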
-- QUERY 1
SELECT
T.ENTITY_ID,
MAX(LEVEL) MAX_LEVEL
FROM TEST T
WHERE EXTRACT(YEAR FROM T.DATE_FROM) = 2014
CONNECT BY
T.ENTITY_ID = PRIOR T.ENTITY_ID
AND T.DATE_FROM = PRIOR T.DATE_TO
GROUP BY T.ENTITY_ID
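However, what I want to do is count the rows in the island whose DATE_FROM and DATE_TO span a minimum number of days. When I do this, I don't want to break the levels of the island.
So I tried this, but it is wrong. The results are not always what I want.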
-- QUERY 2
SELECT
T.ENTITY_ID,
MAX(LEVEL) MAX_LEVEL,
SUM(
CASE WHEN PRIOR T.DATE_TO - PRIOR T.DATE_FROM > 183
THEN 1
ELSE 0
END
) LONG_TERM_COUNT
FROM TEST T
WHERE EXTRACT(YEAR FROM T.DATE_FROM) = 2014
CONNECT BY
T.ENTITY_ID = PRIOR T.ENTITY_ID
AND T.DATE_FROM = PRIOR T.DATE_TO
GROUP BY T.ENTITY_ID
which gives
+-----------+-----------+-----------------+
| ENTITY_ID | MAX_LEVEL | LONG_TERM_COUNT |
+-----------+-----------+-----------------+
| 1 | 4 | 3 |
| 2 | 5 | 4 |
+-----------+-----------+-----------------+
but I am looking for
+-----------+-----------+-----------------+
| ENTITY_ID | MAX_LEVEL | LONG_TERM_COUNT |
+-----------+-----------+-----------------+
| 1 | 4 | 4 |
| 2 | 5 | 4 |
+-----------+-----------+-----------------+
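I need an Oracle solution. Thanks for reading.
Answer 0 (score: 1)
The WHERE condition is evaluated after CONNECT BY, so your query does not start from the 2014 rows. It builds a hierarchy starting from every row in the table, which you can easily see when you remove the WHERE and the aggregation: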
SELECT
T.ENTITY_ID,
LEVEL,
T.DATE_TO,
T.DATE_FROM,
prior T.DATE_TO,
prior T.DATE_FROM
FROM TEST T
CONNECT BY
T.ENTITY_ID = PRIOR T.ENTITY_ID
AND T.DATE_TO = PRIOR T.DATE_FROM
order by 1,2
You need to use START WITH instead of the WHERE condition:
SELECT
T.ENTITY_ID,
LEVEL,
T.DATE_TO,
T.DATE_FROM,
prior T.DATE_TO,
prior T.DATE_FROM
FROM TEST T
START WITH EXTRACT(YEAR FROM T.DATE_FROM) = 2014
CONNECT BY
T.ENTITY_ID = PRIOR T.ENTITY_ID
AND T.DATE_TO = PRIOR T.DATE_FROM
So the final query is:
SELECT
T.ENTITY_ID,
MAX(LEVEL) MAX_LEVEL, -- or COUNT(*)
SUM(
CASE WHEN T.DATE_TO - T.DATE_FROM > 183
THEN 1
ELSE 0
END
) LONG_TERM_COUNT
FROM TEST T
CONNECT BY
T.ENTITY_ID = PRIOR T.ENTITY_ID
AND T.DATE_TO = PRIOR T.DATE_FROM
START WITH EXTRACT(YEAR FROM T.DATE_FROM) = 2014
GROUP BY T.ENTITY_ID
If an entity has two rows in 2014, you may get a wrong result, so you need to start from the latest row in 2014:
SELECT
T.ENTITY_ID,
MAX(LEVEL) MAX_LEVEL,
SUM(
CASE WHEN T.DATE_TO - T.DATE_FROM > 183
THEN 1
ELSE 0
END
) LONG_TERM_COUNT
FROM TEST T
CONNECT BY
T.ENTITY_ID = PRIOR T.ENTITY_ID
AND T.DATE_TO = PRIOR T.DATE_FROM
START WITH T.DATE_FROM =
(
SELECT MAX(T2.DATE_FROM)
FROM TEST T2
WHERE T.ENTITY_ID = T2.ENTITY_ID
AND T2.DATE_FROM >= DATE '2014-01-01'
AND T2.DATE_FROM <= DATE '2014-12-31'
)
GROUP BY T.ENTITY_ID
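The effect of WHERE versus START WITH can be reproduced outside the database. The Python sketch below models only the order in which CONNECT BY visits rows, not Oracle itself. The sample rows and the helper names `walk`, `summarize`, `forward` and `backward` are assumptions made up for illustration; they are not the data from the fiddle.

```python
from collections import namedtuple
from datetime import date

Row = namedtuple("Row", "entity_id date_from date_to")

# Hypothetical sample data (NOT the rows from the original fiddle).
ROWS = [
    Row(1, date(2011, 1, 1), date(2012, 1, 1)),
    Row(1, date(2012, 1, 1), date(2013, 1, 1)),
    Row(1, date(2013, 1, 1), date(2014, 1, 1)),
    Row(1, date(2014, 1, 1), date(2015, 1, 1)),
    # Entity 2 has two rows that start in 2014; the first spans only 181 days.
    Row(2, date(2012, 1, 1), date(2013, 1, 1)),
    Row(2, date(2013, 1, 1), date(2014, 1, 1)),
    Row(2, date(2014, 1, 1), date(2014, 7, 1)),
    Row(2, date(2014, 7, 1), date(2015, 1, 1)),
]


def is_long(row):
    # A missing PRIOR row (level 1) behaves like NULL: it is never counted.
    return row is not None and (row.date_to - row.date_from).days > 183


def walk(rows, roots, is_child):
    """Yield (level, row, prior_row) for every path, mimicking CONNECT BY."""
    stack = [(1, root, None) for root in roots]
    while stack:
        level, row, prior = stack.pop()
        yield level, row, prior
        stack.extend((level + 1, c, row) for c in rows if is_child(c, row))


def summarize(visited, counted):
    """Per entity: (MAX(LEVEL), SUM(CASE WHEN ... > 183 THEN 1 ELSE 0 END))."""
    out = {}
    for level, row, prior in visited:
        max_level, total = out.get(row.entity_id, (0, 0))
        hit = is_long(counted(row, prior))
        out[row.entity_id] = (max(max_level, level), total + hit)
    return out


def forward(child, parent):   # T.DATE_FROM = PRIOR T.DATE_TO
    return child.entity_id == parent.entity_id and child.date_from == parent.date_to


def backward(child, parent):  # T.DATE_TO = PRIOR T.DATE_FROM
    return child.entity_id == parent.entity_id and child.date_to == parent.date_from


in_2014 = [r for r in ROWS if r.date_from.year == 2014]

# QUERY 2 style: every row is a root, the 2014 filter is applied afterwards,
# and the CASE tests the PRIOR row.
query2 = summarize(
    (v for v in walk(ROWS, ROWS, forward) if v[1].date_from.year == 2014),
    lambda row, prior: prior,
)

# START WITH any 2014 row, walking backwards and testing the current row.
start_with = summarize(walk(ROWS, in_2014, backward), lambda row, prior: row)

# START WITH only the latest 2014 row of each entity.
latest = {}
for r in in_2014:
    if r.entity_id not in latest or r.date_from > latest[r.entity_id].date_from:
        latest[r.entity_id] = r
start_latest = summarize(
    walk(ROWS, list(latest.values()), backward), lambda row, prior: row
)

print(query2)        # {1: (4, 3), 2: (4, 2)}
print(start_with)    # {1: (4, 4), 2: (4, 5)}
print(start_latest)  # {1: (4, 4), 2: (4, 3)}
```

On this made-up data, the QUERY 2 style undercounts entity 1 (3 instead of 4) because the 2014 row itself is never tested. Starting from every 2014 row overcounts entity 2 (5 long rows in an island of 4) because the older rows are visited once per root. Starting from the latest 2014 row visits each row of the island exactly once.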
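Answer 1 (score: 0)
Your SQL statement is correct. However, there is one case to consider: when the expression in CASE WHEN T.DATE_TO - PRIOR T.DATE_FROM > 183 evaluates to null, the row is not counted.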
INSERT INTO TEST
VALUES (1,TO_DATE('20130101','YYYYMMDD'),TO_DATE('20140101','YYYYMMDD'));
INSERT INTO TEST
VALUES (1,TO_DATE('20140101','YYYYMMDD'),TO_DATE('20150101','YYYYMMDD'));
With your sample data, the equivalent case is
CASE WHEN
TO_DATE('20140101','YYYYMMDD') - PRIOR TO_DATE('20140101','YYYYMMDD') > 183
which gives a null value.
Answer 2 (score: 0)
I am not familiar with Oracle, but a good approach might be to use the RANK window function.
For example:
SELECT
T.ENTITY_ID,
T.DATE_FROM,
RANK() OVER (PARTITION BY ENTITY_ID
ORDER BY T.DATE_TO DESC) "Rank"
FROM TEST T
WHERE EXTRACT(YEAR FROM T.DATE_FROM) <= 2014
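Joining on T.ENTITY_ID = Prior T.ENTITY_ID and Rank = (PRIOR.Rank + 1) might lead to a solution. As I said, this is only a suggestion for how to approach the problem.
I experimented a bit, and here is my solution using a subquery (SQL Fiddle):
SELECT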
T.ENTITY_ID,
MAX(LEVEL) MAX_LEVEL,
(Select MAX("Rank") FROM
(
SELECT T2.ENTITY_ID AS ID, RANK() OVER (PARTITION BY T2.ENTITY_ID
ORDER BY T2.DATE_TO DESC) "Rank"
FROM TEST T2
WHERE EXTRACT(YEAR FROM T2.DATE_FROM) < 2014
) SubQ
WHERE ID = T.ENTITY_ID
) "LONG_TERM_COUNT"
FROM TEST T
WHERE EXTRACT(YEAR FROM T.DATE_FROM) = 2014
CONNECT BY
T.ENTITY_ID = PRIOR T.ENTITY_ID
AND T.DATE_FROM = PRIOR T.DATE_TO
GROUP BY T.ENTITY_ID