Display a column that is not in the GROUP BY clause without applying an aggregate function to it

Time: 2016-02-25 15:48:02

Tags: sql oracle group-by aggregate-functions

I know this question has been asked before, but my case may be different. I have this table:

|  PK_DATA  |  EVENT_TYPE  |  DATE   |
-------------------------------------
|  123      |  D           |  12 DEC |
|  123      |  I           |  11 DEC |
|  123      |  U           |  10 DEC |
|  124      |  D           |  11 JAN |
|  124      |  U           |  12 JAN |
|  125      |  I           |  1 JAN  |
-------------------------------------

I want a query that groups by PK_DATA, takes max(DATE), and also returns the EVENT_TYPE of the row holding that maximum date, i.e.:

|  123  |  D  |  12 DEC |
|  124  |  U  |  12 JAN |
|  125  |  I  |  1 JAN  |

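I want to group by PK_DATA and select max(DATE), but then EVENT_TYPE cannot be selected unless I either apply an aggregate function to it or add it to the GROUP BY clause, and neither gives me what I want. Any help?

By the way, I would like to avoid nested queries. I know it can be done in two steps: a nested query that does the grouping, followed by a join between the main table and that query's result.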

3 answers:

Answer 0: (score: 5)

You can use the KEEP clause. On larger data sets it is noticeably faster and uses fewer resources than a window function:

WITH data (PK_DATA, EVENT_TYPE, "DATE") AS (
  SELECT 123, 'D', DATE'2015-12-12' FROM DUAL UNION ALL
  SELECT 123, 'I', DATE'2015-12-11' FROM DUAL UNION ALL
  SELECT 123, 'U', DATE'2015-12-10' FROM DUAL UNION ALL
  SELECT 124, 'D', DATE'2015-01-11' FROM DUAL UNION ALL
  SELECT 124, 'U', DATE'2015-01-12' FROM DUAL UNION ALL
  SELECT 125, 'I', DATE'2015-01-01' FROM DUAL)
SELECT
  PK_DATA,
  MAX(EVENT_TYPE) KEEP (DENSE_RANK LAST ORDER BY "DATE") EVENT_TYPE,
  MAX("DATE") "DATE"
FROM
  data
GROUP BY
  PK_DATA

Edit: here is a comparison between ROW_NUMBER and KEEP:

PANELMANAGEMENT@panel_management> set autot trace stat
PANELMANAGEMENT@panel_management> SELECT
  2     INVOICEDATE,
  3     MAX(CREATED) V1,
  4     MAX(TOTALCOST) KEEP (DENSE_RANK LAST ORDER BY ORDER_ID) V2
  5  FROM
  6     ORDERS
  7  GROUP BY
  8     INVOICEDATE
  9  ORDER BY
 10     INVOICEDATE;

269 rows selected.

Elapsed: 00:00:05.03

PANELMANAGEMENT@panel_management> SELECT
  2     INVOICEDATE,
  3     CREATED V1,
  4     TOTALCOST V2
  5  FROM (
  6     SELECT
  7             INVOICEDATE,
  8             CREATED,
  9             TOTALCOST,
 10             ROW_NUMBER() OVER (PARTITION BY INVOICEDATE ORDER BY ORDER_ID DESC) FILTER
 11     FROM
 12             ORDERS)
 13  WHERE
 14     FILTER = 1
 15  ORDER BY
 16     INVOICEDATE;

269 rows selected.

Elapsed: 00:00:21.82

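The ORDERS table has about 10 million records and 1 GB of data. The main difference is that the analytic function has to allocate far more memory, because it must assign a row number to all 10 million rows and only then filter them down to the resulting 269 rows. With KEEP, Oracle knows it only needs to hold one row per INVOICEDATE. Put differently: to sort 10 million rows you need memory to store all of them, but if you only need to keep one record per group while ordering those 10 million rows, you can allocate a single record per group and simply replace it whenever a larger/smaller one comes along. In this case the analytic function needed about 100 MB of memory, while KEEP needed "none".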
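The memory argument above can be illustrated outside the database. The following is only a rough Python sketch of the idea, not a description of how Oracle implements either operator. The function names `keep_last` and `row_number_filter` and the sample rows are hypothetical, and the sketch assumes ORDER_ID is unique within each INVOICEDATE.

```python
from collections import defaultdict

# Hypothetical sample rows: (invoicedate, order_id, totalcost)
ROWS = [
    ("2016-01-01", 1, 10.0),
    ("2016-01-01", 3, 30.0),
    ("2016-01-01", 2, 20.0),
    ("2016-01-02", 5, 50.0),
    ("2016-01-02", 4, 40.0),
]


def keep_last(rows):
    """KEEP-style idea: one pass, one retained record per group."""
    best = {}  # invoicedate -> (order_id, totalcost)
    for invoicedate, order_id, totalcost in rows:
        current = best.get(invoicedate)
        # Replace the retained record only when a larger order_id shows up.
        if current is None or order_id > current[0]:
            best[invoicedate] = (order_id, totalcost)
    return {key: value[1] for key, value in best.items()}


def row_number_filter(rows):
    """ROW_NUMBER-style idea: buffer every row, sort, keep row number 1."""
    groups = defaultdict(list)
    for invoicedate, order_id, totalcost in rows:
        groups[invoicedate].append((order_id, totalcost))  # all rows held
    result = {}
    for invoicedate, members in groups.items():
        members.sort(key=lambda member: member[0], reverse=True)
        result[invoicedate] = members[0][1]
    return result
```

Both functions return the same mapping, but `keep_last` holds one record per group at any time, whereas `row_number_filter` holds every input row before discarding all but one per group.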

Answer 1: (score: 3)

You can use a window function to establish a row_number for each group:

select *
from (
   -- "DATE" is quoted because DATE is a reserved word in Oracle
   select pk_data, event_type, "DATE",
       row_number() over (partition by pk_data order by "DATE" desc) rn
   from yourtable
) t
where rn = 1

If you need to account for ties, use rank instead of row_number.
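To see how row_number and rank differ when two rows in a group share the latest date, here is a small sketch that runs the same query shape through Python's built-in `sqlite3` module. It assumes SQLite 3.25 or later (for window functions) rather than Oracle; the table `events`, the column `event_date`, and the helper `top_rows` are hypothetical names chosen for the illustration.

```python
import sqlite3


def top_rows(ranking_function):
    """Return the rows ranked 1 per pk_data using the given window function."""
    if ranking_function not in ("row_number", "rank"):
        raise ValueError("expected 'row_number' or 'rank'")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table events (pk_data integer, event_type text, event_date text)"
    )
    conn.executemany(
        "insert into events values (?, ?, ?)",
        [
            (123, "D", "2015-12-12"),
            (123, "I", "2015-12-12"),  # same latest date as the row above
            (123, "U", "2015-12-10"),
            (125, "I", "2015-01-01"),
        ],
    )
    query = f"""
        select pk_data, event_type
        from (
            select pk_data, event_type,
                   {ranking_function}() over (
                       partition by pk_data order by event_date desc
                   ) as rn
            from events
        )
        where rn = 1
        order by pk_data, event_type
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows
```

With `rank`, both tied rows for 123 are returned; with `row_number`, only one of them is, and which one is not determined by the ORDER BY.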

Answer 2: (score: 0)

I found a solution, but I am not sure whether it performs better than Husqiv's, so I am posting it to share the knowledge:

WITH data (PK_DATA, EVENT_TYPE, "DATE") AS (
  SELECT 123, 'D', DATE'2015-12-12' FROM DUAL UNION ALL
  SELECT 123, 'I', DATE'2015-12-11' FROM DUAL UNION ALL
  SELECT 123, 'U', DATE'2015-12-10' FROM DUAL UNION ALL
  SELECT 124, 'D', DATE'2015-01-11' FROM DUAL UNION ALL
  SELECT 124, 'U', DATE'2015-01-12' FROM DUAL UNION ALL
  SELECT 125, 'I', DATE'2015-01-01' FROM DUAL)
SELECT EVENT_TYPE, "DATE", PK_DATA
FROM (
  -- PK_DATA must be selected here so the outer query can return it
  SELECT EVENT_TYPE, "DATE", PK_DATA,
         MAX("DATE") OVER (PARTITION BY PK_DATA) max_date
  FROM data)
WHERE "DATE" = max_date;
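As a quick check of the max() over (partition by ...) idea in this answer, the sketch below runs an equivalent query against the question's sample data using Python's built-in `sqlite3` module. It assumes SQLite 3.25 or later rather than Oracle, stores dates as ISO strings so that text comparison matches date order, and uses the hypothetical names `event_date` and `latest_events`.

```python
import sqlite3

# Sample data from the question, with dates written as ISO strings.
SAMPLE = [
    (123, "D", "2015-12-12"),
    (123, "I", "2015-12-11"),
    (123, "U", "2015-12-10"),
    (124, "D", "2015-01-11"),
    (124, "U", "2015-01-12"),
    (125, "I", "2015-01-01"),
]


def latest_events(rows):
    """Return every row whose date equals the maximum date of its pk_data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table data (pk_data integer, event_type text, event_date text)"
    )
    conn.executemany("insert into data values (?, ?, ?)", rows)
    result = conn.execute(
        """
        select pk_data, event_type, event_date
        from (
            select pk_data, event_type, event_date,
                   max(event_date) over (partition by pk_data) as max_date
            from data
        )
        where event_date = max_date
        order by pk_data
        """
    ).fetchall()
    conn.close()
    return result
```

Note that, like rank, this form returns more than one row for a PK_DATA when several rows share the maximum date.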