I know this question has been asked before, but my case may be different. I have this table:
| PK_DATA | EVENT_TYPE | DATE |
-------------------------------------
| 123 | D | 12 DEC |
| 123 | I | 11 DEC |
| 123 | U | 10 DEC |
| 124 | D | 11 JAN |
| 124 | U | 12 JAN |
| 125 | I | 1 JAN |
-------------------------------------
I want a query that groups by PK_DATA, takes max(DATE), and also returns the corresponding EVENT_TYPE, i.e.:
| 123 | D | 12 DEC |
| 124 | U | 12 JAN |
| 125 | I | 1 JAN |
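If I group by PK_DATA and select max(DATE), then EVENT_TYPE cannot be selected unless I apply an aggregate function to it or add it to the GROUP BY clause, and neither gives me what I want. Any help?
By the way, I would like to avoid nested queries. I know it can be done in two steps: a nested query that does the grouping, then a join of the main table back to that query's result.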
Answer 0 (score: 5)
You can use the KEEP clause. On larger data sets it is noticeably faster and uses fewer resources than a window function:
WITH data (PK_DATA, EVENT_TYPE, "DATE") AS (
    SELECT 123, 'D', DATE'2015-12-12' FROM DUAL UNION ALL
    SELECT 123, 'I', DATE'2015-12-11' FROM DUAL UNION ALL
    SELECT 123, 'U', DATE'2015-12-10' FROM DUAL UNION ALL
    SELECT 124, 'D', DATE'2015-01-11' FROM DUAL UNION ALL
    SELECT 124, 'U', DATE'2015-01-12' FROM DUAL UNION ALL
    SELECT 125, 'I', DATE'2015-01-01' FROM DUAL)
SELECT
    PK_DATA,
    MAX(EVENT_TYPE) KEEP (DENSE_RANK LAST ORDER BY "DATE") EVENT_TYPE,
    MAX("DATE") "DATE"
FROM
    data
GROUP BY
    PK_DATA
Edit: here is a comparison between ROW_NUMBER and KEEP:
PANELMANAGEMENT@panel_management> set autot trace stat
PANELMANAGEMENT@panel_management> SELECT
2 INVOICEDATE,
3 MAX(CREATED) V1,
4 MAX(TOTALCOST) KEEP (DENSE_RANK LAST ORDER BY ORDER_ID) V2
5 FROM
6 ORDERS
7 GROUP BY
8 INVOICEDATE
9 ORDER BY
10 INVOICEDATE;
269 rows selected.
Elapsed: 00:00:05.03
PANELMANAGEMENT@panel_management> SELECT
2 INVOICEDATE,
3 CREATED V1,
4 TOTALCOST V2
5 FROM (
6 SELECT
7 INVOICEDATE,
8 CREATED,
9 TOTALCOST,
10 ROW_NUMBER() OVER (PARTITION BY INVOICEDATE ORDER BY ORDER_ID DESC) FILTER
11 FROM
12 ORDERS)
13 WHERE
14 FILTER = 1
15 ORDER BY
16 INVOICEDATE;
269 rows selected.
Elapsed: 00:00:21.82
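The ORDERS table has about 10 million records and 1 GB of data. The main difference is that the analytic function needs to allocate much more memory, because it has to assign a row number to all 10 million rows and only then filter them down to the resulting 269 rows. With KEEP, Oracle knows it only needs to hold one row per INVOICEDATE. Put another way: to sort 10 million rows you need memory to store all of them, but if you only need to keep one record per group while ordering those 10 million rows, you can allocate a single record per group and simply replace it whenever a larger/smaller one comes along. In this case the analytic function needed roughly 100 MB of memory, while KEEP needed "none".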
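The memory argument above can be illustrated with a small Python sketch. This is only a model of the two strategies, not a description of how Oracle actually implements them; the sample rows, the column layout and the function names `keep_last` and `rank_then_filter` are all made up for the illustration.

```python
# Hypothetical sample rows: (invoice_date, order_id, total_cost)
ROWS = [
    ("2015-01-01", 1, 10.0),
    ("2015-01-01", 3, 30.0),
    ("2015-01-02", 2, 20.0),
    ("2015-01-01", 2, 25.0),
    ("2015-01-02", 5, 50.0),
]


def keep_last(rows):
    """Single pass that stores only one (order_id, total_cost) pair per group.

    Roughly mimics MAX(total_cost) KEEP (DENSE_RANK LAST ORDER BY order_id):
    the tuple comparison prefers the highest order_id and, on a tie,
    the highest total_cost.
    """
    best = {}
    for invoice_date, order_id, total_cost in rows:
        current = best.get(invoice_date)
        if current is None or (order_id, total_cost) > current:
            best[invoice_date] = (order_id, total_cost)
    return {day: cost for day, (_, cost) in best.items()}


def rank_then_filter(rows):
    """Sorts a full copy of the input, numbers the rows per group, keeps row 1.

    Roughly mimics ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC)
    followed by a filter on row number 1.
    """
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))  # every row is held in memory
    row_number = {}
    result = {}
    for invoice_date, _order_id, total_cost in ordered:
        row_number[invoice_date] = row_number.get(invoice_date, 0) + 1
        if row_number[invoice_date] == 1:
            result[invoice_date] = total_cost
    return result


kept = keep_last(ROWS)
ranked = rank_then_filter(ROWS)
```

Both functions return the same mapping, but `keep_last` never holds more than one entry per distinct date, whereas `rank_then_filter` has to materialise and sort the whole input before it can discard anything.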
Answer 1 (score: 3)
You can use a window function to establish a row_number for each group:
-- DATE is a reserved word in Oracle, so the column name has to be quoted
select *
from (
    select pk_data, event_type, "DATE",
           row_number() over (partition by pk_data order by "DATE" desc) rn
    from yourtable
) t
where rn = 1
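If ties are a concern (several rows of the same PK_DATA sharing the latest date), use rank instead of row_number so that all tied rows are returned.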
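To see how the two functions differ on ties, here is a small sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for Oracle here purely for convenience (it needs version 3.25 or later for window functions), and the second `123` row with the same date is a hypothetical tie that does not appear in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourtable (pk_data INTEGER, event_type TEXT, "DATE" TEXT)')
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        (123, "D", "2015-12-12"),
        (123, "I", "2015-12-12"),  # hypothetical tie on the latest date
        (123, "U", "2015-12-10"),
        (125, "I", "2015-01-01"),
    ],
)

# Same shape as the query above; {fn} is replaced by row_number or rank.
QUERY = """
    SELECT pk_data, event_type
    FROM (
        SELECT pk_data, event_type,
               {fn}() OVER (PARTITION BY pk_data ORDER BY "DATE" DESC) AS rn
        FROM yourtable
    ) t
    WHERE rn = 1
    ORDER BY pk_data, event_type
"""


def latest(fn):
    """Return the rows numbered 1 by the given window function."""
    return conn.execute(QUERY.format(fn=fn)).fetchall()


with_row_number = latest("row_number")  # one row per pk_data, tie broken arbitrarily
with_rank = latest("rank")              # every row that shares the latest date
```

With `row_number` exactly one of the two tied rows for `123` survives, and which one is not defined; with `rank` both come back.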
Answer 2 (score: 0)
I found a solution, but I am not sure whether it performs better than Husqiv's, so I am posting it to share the knowledge:
WITH data (PK_DATA, EVENT_TYPE, "DATE") AS (
SELECT 123, 'D', DATE'2015-12-12' FROM DUAL UNION ALL
SELECT 123, 'I', DATE'2015-12-11' FROM DUAL UNION ALL
SELECT 123, 'U', DATE'2015-12-10' FROM DUAL UNION ALL
SELECT 124, 'D', DATE'2015-01-11' FROM DUAL UNION ALL
SELECT 124, 'U', DATE'2015-01-12' FROM DUAL UNION ALL
SELECT 125, 'I', DATE'2015-01-01' FROM DUAL)
select EVENT_TYPE, "DATE", PK_DATA
from (select EVENT_TYPE, "DATE", PK_DATA,
             max("DATE") over (PARTITION BY PK_DATA) max_date
      from data)
where "DATE" = max_date;