I have a requirement where I need to output grouped results based on a row count.
Below is the result set I get from SQL:
ID  Date        Count
1   10/01/2013  50
1   10/02/2013  25
1   10/03/2013  100
1   10/04/2013  200
1   10/05/2013  175
1   10/06/2013  45
2   10/01/2013  85
2   10/02/2013  100
Can I get them as:
ID  Date        Count
1   10/03/2013  175
1   10/04/2013  200
1   10/05/2013  175
1   10/06/2013  45
2   10/02/2013  185
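I need to reduce the result set by grouping rows, per ID, so that each group's total count is <= 200. For example, the counts for 10/01, 10/02 and 10/03 add up to 175, so I need to group them into one row. Adding the values for 10/05 and 10/06 would give a total > 200, so those rows stay ungrouped.
Is it possible to solve this in Oracle 11g using PL/SQL or SQL analytic functions?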
New result requested: is there a way to return the result with an additional column, StartD? For each row, it must take the end date of the previous row.
ID  StartD      EndDate     Count
1   10/01/2013  10/03/2013  175
1   10/03/2013  10/04/2013  200
1   10/04/2013  10/05/2013  250
1   10/05/2013  10/06/2013  190
1   10/06/2013  10/08/2013  45
2   10/01/2013  10/01/2013  185
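Answer 0 (score: 3)
You can do this in Oracle 12c with the MATCH_RECOGNIZE pattern-matching technique.
Setup (with a few rows added, including some with counts greater than 200, for testing):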
create table stuff (id int, stamp date, num int);
insert into stuff values (1, to_date('10/01/2013', 'MM/DD/RRRR'), 50);
insert into stuff values (1, to_date('10/02/2013', 'MM/DD/RRRR'), 25);
insert into stuff values (1, to_date('10/03/2013', 'MM/DD/RRRR'), 100);
insert into stuff values (1, to_date('10/04/2013', 'MM/DD/RRRR'), 200);
insert into stuff values (1, to_date('10/05/2013', 'MM/DD/RRRR'), 250);
insert into stuff values (1, to_date('10/06/2013', 'MM/DD/RRRR'), 175);
insert into stuff values (1, to_date('10/07/2013', 'MM/DD/RRRR'), 15);
insert into stuff values (1, to_date('10/08/2013', 'MM/DD/RRRR'), 45);
insert into stuff values (2, to_date('10/01/2013', 'MM/DD/RRRR'), 85);
insert into stuff values (2, to_date('10/02/2013', 'MM/DD/RRRR'), 100);
commit;
The query would be:
select id, first_stamp, last_stamp, partial_sum
from stuff
match_recognize (
  partition by id order by stamp
  measures
    first(a.stamp) as first_stamp
  , last(a.stamp) as last_stamp
  , sum(a.num) as partial_sum
  pattern (A+)
  define A as (sum(a.num) <= 200 or (count(*) = 1 and a.num > 200))
);
This gives:
        ID FIRST_STAMP LAST_STAMP PARTIAL_SUM
---------- ----------- ---------- -----------
         1 01-OCT-13   03-OCT-13          175
         1 04-OCT-13   04-OCT-13          200
         1 05-OCT-13   05-OCT-13          250
         1 06-OCT-13   07-OCT-13          190
         1 08-OCT-13   08-OCT-13           45
         2 01-OCT-13   02-OCT-13          185
6 rows selected
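How this works:
- The rows are partitioned by id and ordered by timestamp.
- The pattern A+ says that we want groups of consecutive rows (according to the partition by and order by clauses) that satisfy condition A.
- Condition A is that the set of rows matched so far satisfies one of:
  - the sum of num over the set is 200 or less, or
  - the set is a single row whose num is greater than 200 (otherwise such a row would never match and would not be output).
- The measures clause indicates what each match returns (on top of the partition key):
  - the first and last timestamps of the group
  - the sum of num for the group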
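To see which input row lands in which group, the same pattern can be run with ALL ROWS PER MATCH instead of the default one row per match. This is only a sketch built on the query above; the column aliases grp and running_sum are illustrative names, not part of the original answer.

```sql
-- Same partitioning, pattern and condition as the query above,
-- but every matched row is returned together with its group number.
select id, stamp, num, grp, running_sum
from stuff
match_recognize (
  partition by id order by stamp
  measures
    match_number()       as grp          -- sequence number of the match within the partition
  , running sum(a.num)   as running_sum  -- total accumulated so far inside the current match
  all rows per match
  pattern (A+)
  define A as (sum(a.num) <= 200 or (count(*) = 1 and a.num > 200))
);
```

Rows that share the same id and grp would be collapsed into a single output row by the original query.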
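Here is an approach with a pipelined table function that should work in 11g (and, I think, 10g). It is rather inelegant, but it does the job: it walks through the table in order and outputs a group whenever it is "full".
You could also add a parameter for the group size.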
create or replace
type my_row is object (id int, stamp date, num int);
/
create or replace
type my_tab as table of my_row;
/
create or replace
function custom_stuff_groups
  return my_tab pipelined
as
  cur_sum number;
  cur_id  number;
  cur_dt  date;
begin
  cur_sum := null;
  cur_id  := null;
  cur_dt  := null;
  for x in (select id, stamp, num from stuff order by id, stamp)
  loop
    if (cur_sum is null) then
      -- very first row
      cur_id  := x.id;
      cur_sum := x.num;
    elsif (cur_id != x.id) then
      -- changed ID, so output last line for previous id and reset
      pipe row(my_row(cur_id, cur_dt, cur_sum));
      cur_id  := x.id;
      cur_sum := x.num;
    elsif (cur_sum + x.num > 200) then
      -- same id, sum overflows: output the current group and start a new one
      pipe row(my_row(cur_id, cur_dt, cur_sum));
      cur_sum := x.num;
    else
      -- same id, sum still at or below 200
      cur_sum := cur_sum + x.num;
    end if;
    -- remember the date of the last row folded into the current group
    cur_dt := x.stamp;
  end loop;
  if (cur_sum is not null) then
    -- output the last line, if any
    pipe row(my_row(cur_id, cur_dt, cur_sum));
  end if;
  -- explicit RETURN marks the end of the pipelined output
  return;
end;
/
Use it as:
select * from table(custom_stuff_groups());
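The group-size parameter mentioned earlier could be added along these lines. This is a sketch only: the function name custom_stuff_groups_limit and the parameter p_limit are made up for illustration, and it reuses the my_row and my_tab types and the stuff table from above.

```sql
create or replace
function custom_stuff_groups_limit (p_limit in number default 200)
  return my_tab pipelined
as
  cur_sum number;
  cur_id  number;
  cur_dt  date;
begin
  for x in (select id, stamp, num from stuff order by id, stamp)
  loop
    if cur_sum is null then
      -- very first row: start the first group
      cur_id  := x.id;
      cur_sum := x.num;
    elsif cur_id != x.id or cur_sum + x.num > p_limit then
      -- new id, or the limit would be exceeded: emit the group and start a new one
      pipe row(my_row(cur_id, cur_dt, cur_sum));
      cur_id  := x.id;
      cur_sum := x.num;
    else
      -- same id and still within the limit: keep accumulating
      cur_sum := cur_sum + x.num;
    end if;
    cur_dt := x.stamp;
  end loop;
  if cur_sum is not null then
    -- emit the final group, if there was any input
    pipe row(my_row(cur_id, cur_dt, cur_sum));
  end if;
  return;
end;
/
-- default limit of 200, then an explicit limit of 300
select * from table(custom_stuff_groups_limit());
select * from table(custom_stuff_groups_limit(300));
```

Keeping 200 as the default means existing callers would see the same grouping as with the fixed-limit function.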
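Answer 1 (score: 2)
This returns the expected result for your sample data. I'm not 100% sure that it works in all cases, though, and it probably isn't very efficient. Note that the query refers to the date and count columns as count_date and cnt: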
with summed_values as (
  select stuff.*,
         case
           when sum(cnt) over (partition by id order by count_date) >= 200 then 1
           else 0
         end as sum_group
  from stuff
), totals as (
  select id,
         max(count_date) as last_count,
         sum(cnt) as total_count
  from summed_values
  where sum_group = 0
  group by id
  union all
  select id,
         count_date as last_count,
         sum(cnt) as total_count
  from summed_values
  where sum_group = 1
  group by id, count_date
)
select *
from totals
order by id, last_count
;
SQLFiddle example: http://sqlfiddle.com/#!4/4e0d8/1
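Answer 2 (score: 1)
For this kind of task, you can use a pipelined table function to generate the desired result.
There is a bit of "plumbing" involved, since it requires defining some additional types, but the function itself is a simple cursor loop that accumulates values and emits a row when the id changes or when the accumulated total would exceed the limit.
You could implement this in many ways. Here, using a plain old loop instead of a cursor FOR loop, I ended up with something that is not so inelegant: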
CREATE OR REPLACE TYPE stuff_row AS OBJECT (
  id         int,
  stamp      date,
  last_stamp date,
  num        int
);
/
CREATE OR REPLACE TYPE stuff_tbl AS TABLE OF stuff_row;
/
CREATE OR REPLACE FUNCTION partition_by_200
  RETURN stuff_tbl PIPELINED
AS
  CURSOR data IS SELECT id, stamp, num FROM stuff ORDER BY id, stamp;
  curr data%ROWTYPE;
  acc  stuff_row := stuff_row(NULL,NULL,NULL,NULL);
BEGIN
  OPEN data;
  -- seed the accumulator with the first row, if there is one
  FETCH data INTO acc.id,acc.stamp,acc.num;
  acc.last_stamp := acc.stamp;
  IF data%FOUND THEN
    LOOP
      FETCH data INTO curr;
      IF data%NOTFOUND OR curr.id <> acc.id OR acc.num+curr.num > 200
      THEN
        -- end of data, new id, or limit exceeded: emit the current group
        PIPE ROW(stuff_row(acc.id,acc.stamp,acc.last_stamp,acc.num));
        EXIT WHEN data%NOTFOUND;
        -- reset the accumulator
        acc := stuff_row(curr.id, curr.stamp, curr.stamp, curr.num);
      ELSE
        -- accumulate value
        acc.num := acc.num + curr.num;
        acc.last_stamp := curr.stamp;
      END IF;
    END LOOP;
  END IF;
  CLOSE data;
  -- explicit RETURN marks the end of the pipelined output
  RETURN;
END;
/
Usage:
SELECT * FROM TABLE(partition_by_200());
Using the same test data as Mat in his own answer, this produces:
ID  STAMP       LAST_STAMP  NUM
1   10/01/2013  10/03/2013  175
1   10/04/2013  10/04/2013  200
1   10/05/2013  10/05/2013  250
1   10/06/2013  10/07/2013  190
1   10/08/2013  10/08/2013  45
2   10/01/2013  10/02/2013  185
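The follow-up request in the question (a StartD column that takes the previous row's end date) could be layered on top of this function with the LAG analytic function. This is a sketch under one reading of that request: the first group of each id has no previous row, so it is assumed here to keep its own first date. The aliases start_d and end_date are illustrative.

```sql
-- For each group, use the previous group's last_stamp (within the same id) as the start date.
-- The first group of an id has no predecessor, so NVL falls back to its own stamp.
select id,
       nvl(lag(last_stamp) over (partition by id order by stamp), stamp) as start_d,
       last_stamp as end_date,
       num
from table(partition_by_200())
order by id, stamp;
```

If the first group should instead show a null start date, the NVL wrapper can simply be dropped.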