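I have a result set which, for simplicity, I will refer to as a table "tab" with three columns: Category, Subcategory and Date, ordered by category and then by date. This dataset is a grid, and I want to perform additional processing on top of that grid. My problem is uniquely identifying (or sequentially labelling) groups within the dataset. Given the first three columns, the SQL below shows what I am after (either GID1 or GID2 will do).
I have tried group_id, grouping_id, rank and dense_rank; either I have missed a trick with one of them or I am attempting something very awkward.
The order of the GIDs is not important, but it is important that the group numbers are allocated based on the ordered data (category, then date).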
CREATE TABLE Tab
("Category" varchar2(1), "SubCategory" varchar2(7), "Date" int, "GID1" int, "GID2" int);
INSERT ALL
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'bannana', 20120101, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'grape', 20120102, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'pear', 20120103, 1, 1)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'pear', 20120104, 1, 1)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'bannana', 20120105, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'pear', 20120106, 2, 2)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'pear', 20120107, 2, 2)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'apple', 20120108, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('A', 'pear', 20120109, 3, 3)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'apple', 20120101, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'bannana', 20120102, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'apple', 20120103, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'bannana', 20120104, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'pear', 20120105, 1, 4)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'pear', 20120106, 1, 4)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'pear', 20120107, 1, 4)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'pear', 20120108, 1, 4)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('B', 'pear', 20120109, 1, 4)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'grape', 20120101, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'grape', 20120102, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'apple', 20120103, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'bannana', 20120104, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'grape', 20120105, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'pear', 20120106, 1, 5)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'apple', 20120107, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'apple', 20120108, NULL, NULL)
INTO Tab ("Category", "SubCategory", "Date", "GID1", "GID2")
VALUES ('C', 'apple', 20120109, NULL, NULL)
SELECT * FROM dual
;
Answer 0 (score: 4)
Well, if it is only the pears you care about, then:
select "Category", "SubCategory", "Date",
       -- GID1: running count of run starts, restarted for each category
       case
         when "SubCategory" = 'pear'
         then count(rn) over (partition by "Category" order by "Date")
         else null
       end GID1,
       -- GID2: running count of run starts across all categories
       case
         when "SubCategory" = 'pear'
         then count(rn) over (order by "Category", "Date")
         else null
       end GID2
  from (select "Category", "SubCategory", "Date",
               lag("SubCategory") over (partition by "Category" order by "Date"),
               -- rn is 1 on the first row of each run of 'pear', null elsewhere
               case
                 when lag("SubCategory") over (partition by "Category" order by "Date") != "SubCategory"
                  and "SubCategory" = 'pear'
                 then 1
                 when row_number() over (partition by "Category" order by "Date") = 1
                  and "SubCategory" = 'pear'
                 then 1
                 else null
               end rn
          from tab)
 order by 1, 3;
Category   SubCate       Date       GID1       GID2
---------- ------- ---------- ---------- ----------
A          bannana   20120101
A          grape     20120102
A          pear      20120103          1          1
A          pear      20120104          1          1
A          bannana   20120105
A          pear      20120106          2          2
A          pear      20120107          2          2
A          apple     20120108
A          pear      20120109          3          3
B          apple     20120101
B          bannana   20120102
B          apple     20120103
B          bannana   20120104
B          pear      20120105          1          4
B          pear      20120106          1          4
B          pear      20120107          1          4
B          pear      20120108          1          4
B          pear      20120109          1          4
C          grape     20120101
C          grape     20120102
C          apple     20120103
C          bannana   20120104
C          grape     20120105
C          pear      20120106          1          5
C          apple     20120107
C          apple     20120108
C          apple     20120109
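To break this down:
We look at the previous row, ordered by "Date" within each "Category", and check whether it has a different "SubCategory" while the current subcategory is 'pear'. If so, we mark the row with a 1 (the value itself is irrelevant; it only has to be non-NULL).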
lag("SubCategory") over (partition by "Category" order by "Date") != "SubCategory"
and "SubCategory" = 'pear'
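We also flag the first row of a category when it is a pear (the row_number() = 1 branch), because LAG returns NULL there and the comparison above cannot be true. This gives us: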
Category   SubCate       Date LAG("SU         RN
---------- ------- ---------- ------- ----------
A          bannana   20120101
A          grape     20120102 bannana
A          pear      20120103 grape            1
A          pear      20120104 pear
A          bannana   20120105 pear
A          pear      20120106 bannana          1
A          pear      20120107 pear
A          apple     20120108 pear
A          pear      20120109 apple            1
B          apple     20120101
B          bannana   20120102 apple
B          apple     20120103 bannana
B          bannana   20120104 apple
B          pear      20120105 bannana          1
B          pear      20120106 pear
B          pear      20120107 pear
B          pear      20120108 pear
B          pear      20120109 pear
C          grape     20120101
C          grape     20120102 grape
C          apple     20120103 grape
C          bannana   20120104 apple
C          grape     20120105 bannana
C          pear      20120106 grape            1
C          apple     20120107 pear
C          apple     20120108 apple
C          apple     20120109 apple
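For anyone who wants to check the flagging logic outside the database, here is a minimal Python sketch of the same step. It is not part of the query above; the function name `flag_run_starts` and the tuple layout are made up for illustration. A row is flagged when it is a 'pear' and the previous row in the same category is missing or holds a different subcategory, which corresponds to the two WHEN branches.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, str, int]  # (category, subcategory, date); hypothetical layout


def flag_run_starts(rows: List[Row], target: str = "pear") -> List[Optional[int]]:
    """Return 1 for each row that starts a run of `target`, else None.

    Rows are processed in (category, date) order, and the "previous row"
    is forgotten at every category boundary, like LAG with
    PARTITION BY "Category".
    """
    flags: List[Optional[int]] = []
    prev_category: Optional[str] = None
    prev_sub: Optional[str] = None
    for category, sub, _date in sorted(rows, key=lambda r: (r[0], r[2])):
        if category != prev_category:
            prev_sub = None  # first row of a category has no predecessor
        is_start = sub == target and prev_sub != sub
        flags.append(1 if is_start else None)
        prev_category, prev_sub = category, sub
    return flags


# Category A from the sample data, copied by hand.
sample = [
    ("A", "bannana", 20120101), ("A", "grape", 20120102),
    ("A", "pear", 20120103), ("A", "pear", 20120104),
    ("A", "bannana", 20120105), ("A", "pear", 20120106),
    ("A", "pear", 20120107), ("A", "apple", 20120108),
    ("A", "pear", 20120109),
]
print(flag_run_starts(sample))
```

The printed list lines up row by row with the RN column for category A.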
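Now we simply count() the non-NULL "RN" values, again ordered by "Date": per "Category" for GID1, and without a partition for GID2 (for GID2 we order by "Category" as well). The relevant lines are: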
count(rn) over (partition by "Category" order by "Date")
(GID1)
and
count(rn) over ( order by "Category", "Date")
(GID2)
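The two running counts can be sketched in Python as well. This is only an illustration under assumptions: the name `assign_gids` and the input layout are hypothetical, and the rows are expected to arrive already sorted by category and date with the RN flag attached. GID1 restarts its count at each category, while GID2 keeps counting across categories.

```python
from typing import List, Optional, Tuple

# (category, subcategory, rn); rn is 1 at the start of a pear run, else None.
FlaggedRow = Tuple[str, str, Optional[int]]
Gids = Tuple[Optional[int], Optional[int]]


def assign_gids(rows: List[FlaggedRow], target: str = "pear") -> List[Gids]:
    """Return one (gid1, gid2) pair per row; non-target rows get (None, None).

    Assumes `rows` is already ordered by category, then date.
    """
    result: List[Gids] = []
    per_category = 0   # like count(rn) over (partition by category order by date)
    overall = 0        # like count(rn) over (order by category, date)
    current: Optional[str] = None
    for category, sub, rn in rows:
        if category != current:
            per_category = 0  # a new partition starts counting from zero
            current = category
        if rn is not None:    # count() ignores NULLs
            per_category += 1
            overall += 1
        result.append((per_category, overall) if sub == target else (None, None))
    return result


flagged = [
    ("A", "grape", None), ("A", "pear", 1), ("A", "pear", None),
    ("A", "bannana", None), ("A", "pear", 1),
    ("B", "apple", None), ("B", "pear", 1), ("B", "pear", None),
]
for row, gids in zip(flagged, assign_gids(flagged)):
    print(row[0], row[1], gids)
```

Because the count only moves on flagged rows, every row of a run shares the number given to its first row.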
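Answer 1 (score: 0)
I never thought this could be done with a count. Brilliant. From Oracle 11g Release 2 (11r2) onward, it can also be done with a recursive query (recursive subquery factoring):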
with r as (
select "Category"
, "SubCategory"
, "Date"
, row_number() over (partition by "SubCategory" order by "Category", "Date") rn
from tab
)
, fwd ( "Category", "SubCategory", "Date", rn, GID1, GID2) as (
select "Category"
, "SubCategory"
, "Date"
, rn
, 1
, 1
from r
where rn = 1
union all
select nxt."Category"
, nxt."SubCategory"
, nxt."Date"
, nxt.rn
, decode( nxt."Category"
, prev."Category", decode( nxt."Date"
, prev."Date" + 1, prev.gid1
, prev.gid1 + 1
)
, 1
) as gid1
, decode( nxt."Date"
, prev."Date" + 1, prev.gid2
, prev.gid2 + 1
) as gid2
from fwd prev
, r nxt
where prev.rn + 1= nxt.rn
and prev."SubCategory" = nxt."SubCategory"
)
select "Category"
, "SubCategory"
, "Date"
, decode( "SubCategory", 'pear', GID1, null ) as gid1
, decode( "SubCategory", 'pear', GID2, null ) as gid2
from fwd
order by "Category", "Date";
It produces the same result:
Category SubCategory       Date       GID1       GID2
-------- ----------- ---------- ---------- ----------
A        bannana       20120101
A        grape         20120102
A        pear          20120103          1          1
A        pear          20120104          1          1
A        bannana       20120105
A        pear          20120106          2          2
A        pear          20120107          2          2
A        apple         20120108
A        pear          20120109          3          3
B        apple         20120101
B        bannana       20120102
B        apple         20120103
B        bannana       20120104
B        pear          20120105          1          4
B        pear          20120106          1          4
B        pear          20120107          1          4
B        pear          20120108          1          4
B        pear          20120109          1          4
C        grape         20120101
C        grape         20120102
C        apple         20120103
C        bannana       20120104
C        grape         20120105
C        pear          20120106          1          5
C        apple         20120107
C        apple         20120108
C        apple         20120109
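It may also be more self-explanatory.
If the decode calls are removed from the final select, the query also produces correct GID1 and GID2 numbers for all the other subcategories, not only for 'pear'.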
Choosing between this variant and the one provided by @DazzaL requires comparing their costs on your own data.
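One thing to keep in mind when comparing: the recursive query continues a group only when the next "Date" of the same subcategory equals the previous one plus 1. That holds for the sample data, where every category has a row for each consecutive date value, but integer-coded dates break it at a month boundary (20120131 + 1 is not 20120201). As a cross-check, here is a hedged Python sketch that numbers the runs of every subcategory by row adjacency instead of date arithmetic. The name `number_runs` and the tuple layout are assumptions for illustration, not something from either answer.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, str, int]  # (category, subcategory, date); hypothetical layout
Numbered = Tuple[str, str, int, int, int]


def number_runs(rows: List[Row]) -> List[Numbered]:
    """Append (gid1, gid2) to every row, for every subcategory.

    A run is a maximal block of adjacent rows, in (category, date) order,
    that share the same category and subcategory.
    gid1 counts runs of a subcategory within one category;
    gid2 counts runs of a subcategory across all categories.
    """
    gid1: Dict[Tuple[str, str], int] = defaultdict(int)
    gid2: Dict[str, int] = defaultdict(int)
    out: List[Numbered] = []
    prev_key: Optional[Tuple[str, str]] = None
    for category, sub, date in sorted(rows, key=lambda r: (r[0], r[2])):
        key = (category, sub)
        if key != prev_key:  # a new run starts on this row
            gid1[key] += 1
            gid2[sub] += 1
        out.append((category, sub, date, gid1[key], gid2[sub]))
        prev_key = key
    return out


rows = [
    ("A", "pear", 20120103), ("A", "pear", 20120104),
    ("A", "bannana", 20120105), ("A", "pear", 20120106),
    ("B", "pear", 20120105), ("B", "pear", 20120106),
]
for numbered in number_runs(rows):
    print(numbered)
```

On this small extract the 'pear' rows are numbered 1, 1, 2 within category A, and the run in category B gets GID1 1 and GID2 3. Two adjacent pear rows dated 20120131 and 20120201 would stay in one group here, since only adjacency is checked.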