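I have a SELECT statement with more than 10 columns. Wherever dates are missing from the data, I need to repeat rows to fill the gaps. Each generated row should carry the data of the preceding row, with rows ordered by date ascending. The date range to consider is determined per id group.
The dates actually range from 15-Mar to 16-Apr, but I have only taken a limited number of rows for the sample.
For example, the data looks like this.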
ID Date Type Code Location
==== ====== === ==== ====
1 15-Mar TG RET X1
1 17-Mar GG CAN S2
1 20-Mar DTR ISS D2
2 14-Apr YT RR F2
2 16-Apr F FC F1
Expected output:
ID Date Type Code Location
=== ==== ==== ==== ======
1 15-Mar TG RET X1
*1 16-Mar TG RET X1*
1 17-Mar GG CAN S2
*1 18-Mar GG CAN S2*
*1 19-Mar GG CAN S2*
1 20-Mar DTR ISS D2
2 14-Apr YT RR F2
*2 15-Apr YT RR F2*
2 16-Apr F FC F1
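Answer 0 (score: 2)
Here is a working example of one possible way to achieve the desired output. I'm using the Oracle LAST_VALUE analytic function with the IGNORE NULLS option and an ORDER BY clause.
Test data (note that id 1 uses May dates here, not the March dates from the question):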
CREATE TABLE so123 (
    id       NUMBER,
    d        DATE,
    type     VARCHAR2(10),
    code     VARCHAR2(10),
    location VARCHAR2(10)
);
INSERT INTO so123 VALUES (1, DATE '2015-05-15', 'TG', 'RET', 'X1');
INSERT INTO so123 VALUES (1, DATE '2015-05-17', 'GG', 'CAN', 'S2');
INSERT INTO so123 VALUES (1, DATE '2015-05-20', 'DTR', 'ISS', 'D2');
INSERT INTO so123 VALUES (2, DATE '2015-04-14', 'YT', 'RR', 'F2');
INSERT INTO so123 VALUES (2, DATE '2015-04-16', 'F', 'FC', 'F1');
COMMIT;
The query itself:
WITH dmm AS (
    SELECT MIN(d) min_d, MAX(d) max_d FROM so123
)
SELECT
    NVL(s.id, LAST_VALUE(s.id) IGNORE NULLS OVER (ORDER BY dt.d)) AS id,
    dt.d,
    NVL(s.type, LAST_VALUE(s.type) IGNORE NULLS OVER (ORDER BY dt.d)) AS type,
    NVL(s.code, LAST_VALUE(s.code) IGNORE NULLS OVER (ORDER BY dt.d)) AS code,
    NVL(s.location, LAST_VALUE(s.location) IGNORE NULLS OVER (ORDER BY dt.d)) AS location
FROM (
    SELECT min_d + level - 1 AS d
    FROM dmm
    CONNECT BY min_d + level - 1 <= max_d
) dt
LEFT JOIN so123 s ON (dt.d = s.d)
ORDER BY dt.d
;
Output:
ID D TYPE CODE LOCATION
---------- ---------------- ---------- ---------- ----------
2 14-04-2015 00:00 YT RR F2
2 15-04-2015 00:00 YT RR F2
2 16-04-2015 00:00 F FC F1
2 17-04-2015 00:00 F FC F1
2 18-04-2015 00:00 F FC F1
2 19-04-2015 00:00 F FC F1
2 20-04-2015 00:00 F FC F1
2 21-04-2015 00:00 F FC F1
2 22-04-2015 00:00 F FC F1
2 23-04-2015 00:00 F FC F1
2 24-04-2015 00:00 F FC F1
2 25-04-2015 00:00 F FC F1
2 26-04-2015 00:00 F FC F1
2 27-04-2015 00:00 F FC F1
2 28-04-2015 00:00 F FC F1
2 29-04-2015 00:00 F FC F1
2 30-04-2015 00:00 F FC F1
2 01-05-2015 00:00 F FC F1
2 02-05-2015 00:00 F FC F1
2 03-05-2015 00:00 F FC F1
2 04-05-2015 00:00 F FC F1
2 05-05-2015 00:00 F FC F1
2 06-05-2015 00:00 F FC F1
2 07-05-2015 00:00 F FC F1
2 08-05-2015 00:00 F FC F1
2 09-05-2015 00:00 F FC F1
2 10-05-2015 00:00 F FC F1
2 11-05-2015 00:00 F FC F1
2 12-05-2015 00:00 F FC F1
2 13-05-2015 00:00 F FC F1
2 14-05-2015 00:00 F FC F1
1 15-05-2015 00:00 TG RET X1
1 16-05-2015 00:00 TG RET X1
1 17-05-2015 00:00 GG CAN S2
1 18-05-2015 00:00 GG CAN S2
1 19-05-2015 00:00 GG CAN S2
1 20-05-2015 00:00 DTR ISS D2
37 rows selected
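How does this work? We generate all dates between the MIN and MAX dates of the source table. To do that, we use the CONNECT BY clause to make Oracle generate records until the condition min_d + level - 1 <= max_d no longer holds.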
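As a minimal sketch of the row-generation step in isolation, the following query hard-codes a start date and a day count (both chosen here purely for illustration) and selects from DUAL instead of the dmm subquery:

```sql
-- Generate one row per day, starting at an assumed date of 2015-03-15.
-- LEVEL counts 1, 2, 3, ... and CONNECT BY keeps producing rows
-- while the condition holds, so this yields six consecutive dates.
SELECT DATE '2015-03-15' + LEVEL - 1 AS d
FROM dual
CONNECT BY LEVEL <= 6;
```

With these values the query returns the dates 15-Mar-2015 through 20-Mar-2015.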
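Then we take the generated records and LEFT JOIN the source table to them. This is where the LAST_VALUE analytic function works its magic. Using the specified ordering, it finds the last non-null value (the IGNORE NULLS option) up to the current row and uses it to fill in the missing fields.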
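To see the fill-forward behaviour on its own, here is a small sketch using four made-up rows; the names sample_rows, val and filled are illustrative and not part of the answer:

```sql
-- Four illustrative rows; two of them have no value.
WITH sample_rows AS (
    SELECT DATE '2015-03-15' AS d, 'A' AS val FROM dual UNION ALL
    SELECT DATE '2015-03-16', NULL FROM dual UNION ALL
    SELECT DATE '2015-03-17', 'B' FROM dual UNION ALL
    SELECT DATE '2015-03-18', NULL FROM dual
)
SELECT
    d,
    val,
    -- Most recent non-null val at or before the current row, ordered by d
    LAST_VALUE(val) IGNORE NULLS OVER (ORDER BY d) AS filled
FROM sample_rows
ORDER BY d;
```

The filled column comes out as A, A, B, B: each NULL takes the value of the nearest earlier row that has one.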
You can read more about this function here:
http://oracle-base.com/articles/misc/first-value-and-last-value-analytic-functions.php
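The query in this answer computes a single date range for the whole table and has no PARTITION BY, which is why its output has 37 rows and carries the last row of id 2 forward until the first date of id 1. The question asks for the range to be taken per id. One possible adaptation is sketched below. It has not been run against the original poster's data, the CTE names ranges, day_offsets and id_days are introduced here for illustration, and it assumes that d has no time component and that there is at most one row per id and date.

```sql
WITH ranges AS (
    -- First and last date of each id
    SELECT id, MIN(d) AS min_d, MAX(d) AS max_d
    FROM so123
    GROUP BY id
),
day_offsets AS (
    -- Offsets 0 .. longest span in days, using the same CONNECT BY technique
    SELECT LEVEL - 1 AS n
    FROM (SELECT MAX(max_d - min_d) AS max_span FROM ranges)
    CONNECT BY LEVEL - 1 <= max_span
),
id_days AS (
    -- Every calendar day between each id's own first and last date
    SELECT r.id, r.min_d + o.n AS d
    FROM ranges r
    JOIN day_offsets o ON o.n <= r.max_d - r.min_d
)
SELECT
    g.id,
    g.d,
    -- PARTITION BY keeps the fill-forward from crossing id boundaries
    LAST_VALUE(s.type) IGNORE NULLS OVER (PARTITION BY g.id ORDER BY g.d) AS type,
    LAST_VALUE(s.code) IGNORE NULLS OVER (PARTITION BY g.id ORDER BY g.d) AS code,
    LAST_VALUE(s.location) IGNORE NULLS OVER (PARTITION BY g.id ORDER BY g.d) AS location
FROM id_days g
LEFT JOIN so123 s ON s.id = g.id AND s.d = g.d
ORDER BY g.id, g.d;
```

Against the test data above this should return nine rows: six for id 1 (15-May to 20-May) and three for id 2 (14-Apr to 16-Apr).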
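Answer 1 (score: 2)
You know you should be worried when you run into a problem that is best solved with the Oracle MODEL clause. The following query returns the desired result: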
SELECT id, d, type, code, location
FROM (
    SELECT
        id, d, type, code, location,
        null min_d,
        null max_d
    FROM t
    UNION ALL
    SELECT
        id, null, null, null, null,
        MIN(d),
        MAX(d)
    FROM t
    GROUP BY id
)
MODEL RETURN UPDATED ROWS
    PARTITION BY (id)
    DIMENSION BY (d)
    MEASURES (type, code, location, min_d, max_d)
    RULES (
        type    [FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
            NVL(type[cv(d)], type[cv(d) - 1]),
        code    [FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
            NVL(code[cv(d)], code[cv(d) - 1]),
        location[FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
            NVL(location[cv(d)], location[cv(d) - 1])
    )
ORDER BY id, d
SQLFiddle result:
| ID | D | TYPE | CODE | LOCATION |
|----|-------------------------|------|------|----------|
| 1 | March, 15 2015 00:00:00 | TG | RET | X1 |
| 1 | March, 16 2015 00:00:00 | TG | RET | X1 |
| 1 | March, 17 2015 00:00:00 | GG | CAN | S2 |
| 1 | March, 18 2015 00:00:00 | GG | CAN | S2 |
| 1 | March, 19 2015 00:00:00 | GG | CAN | S2 |
| 1 | March, 20 2015 00:00:00 | DTR | ISS | D2 |
| 2 | April, 14 2015 00:00:00 | YT | RR | F2 |
| 2 | April, 15 2015 00:00:00 | YT | RR | F2 |
| 2 | April, 16 2015 00:00:00 | F | FC | F1 |
Think of MODEL as a kind of SQL spreadsheet language, a bit like Microsoft Excel, but more powerful, because SQL! Here is the same query again, annotated:
SELECT id, d, type, code, location
FROM (
    -- This is your original data, plus two columns
    SELECT
        id, d, type, code, location,
        null min_d,
        null max_d
    FROM t
    UNION ALL
    -- This is a utility record containing the MIN(d) and MAX(d) values for
    -- each ID partition. We'll use these MIN / MAX values to generate rows
    SELECT
        id, null, null, null, null,
        MIN(d),
        MAX(d)
    FROM t
    GROUP BY id
)
-- We're using the RETURN UPDATED ROWS clause, as we don't want the utility
-- record from above in the results
MODEL RETURN UPDATED ROWS
    -- Your requirement is to fill gaps between dates within each id PARTITION
    PARTITION BY (id)
    -- The dates are your DIMENSION, i.e. the axis along which we're generating rows
    DIMENSION BY (d)
    -- The remaining columns are the MEASURES, i.e. the calculated values in each "cell"
    MEASURES (type, code, location, min_d, max_d)
    -- The following RULES are used to generate rows. For each MEASURE, we simply
    -- iterate from the MIN(d) to the MAX(d) value, referencing the min_d / max_d
    -- values from the utility record above
    RULES (
        type    [FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
            NVL(type[cv(d)], type[cv(d) - 1]),
        code    [FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
            NVL(code[cv(d)], code[cv(d) - 1]),
        location[FOR d FROM min_d[null] TO max_d[null] INCREMENT INTERVAL '1' DAY] =
            NVL(location[cv(d)], location[cv(d) - 1])
    )
ORDER BY id, d
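The query reads from a table t whose definition is not shown in this answer. To try it out, one possible setup is sketched below. It assumes the same column types as the so123 table from the first answer and uses the sample rows from the question, with the year 2015 taken from the result table above.

```sql
-- Hypothetical setup for table t (column types are assumed, not from the answer)
CREATE TABLE t (
    id       NUMBER,
    d        DATE,
    type     VARCHAR2(10),
    code     VARCHAR2(10),
    location VARCHAR2(10)
);

-- Sample rows from the question
INSERT INTO t VALUES (1, DATE '2015-03-15', 'TG',  'RET', 'X1');
INSERT INTO t VALUES (1, DATE '2015-03-17', 'GG',  'CAN', 'S2');
INSERT INTO t VALUES (1, DATE '2015-03-20', 'DTR', 'ISS', 'D2');
INSERT INTO t VALUES (2, DATE '2015-04-14', 'YT',  'RR',  'F2');
INSERT INTO t VALUES (2, DATE '2015-04-16', 'F',   'FC',  'F1');
COMMIT;
```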