I am from SQL Server background, so my skill in Oracle is minimum. It looks like Partition by
is better than max
in terms of performance. Or do I use rownum
to archive my result table?
I have the following table - TableW.
| P_TYPE | TRX_DATE | PROGRAM_NO | REF_NO | SEQ_ID | Select
|-------------|----------------|------------|-----------|--------|
| 'Local' | 2016/9/5 14:37 | C1 | null | E1 | Yes (latest in Sept 5)
| 'Local' | 2016/9/5 14:36 | C1 | null | E1 |
| 'Local' | 2016/9/5 11:08 | C1 | null | E1 |
|-------------|----------------|------------|-----------|--------|
| 'Local' | 2016/9/2 15:16 | C1 | null | E1 | Yes (latest in Sept 2)
|-------------|----------------|------------|-----------|--------|
| 'Local' | 2016/9/1 15:20 | C1 | null | E1 | Yes (latest in Sept 1)
| 'Local' | 2016/9/1 14:33 | C1 | null | E1 |
|-------------|----------------|------------|-----------|--------|
| '3rd Party' | 2016/9/4 18:00 | null | D1 | E2 | Yes
| '3rd Party' | 2016/9/4 17:55 | null | D1 | E2 |
Here is what I want to get:
For column P_TYPE, if it is value of 'Local', use column PROGRAM_NO and SEQ_ID. Otherwise, use REF_NO and SEQ_ID. If value in column P_TYPE is same, check TRX_DATE. If column TRX_DATE indicates the same date, pick the one with latest time stamp. Another day? Another entry with latest time stamp.
| P_TYPE | TRX_DATE | PROGRAM_NO | REF_NO | SEQ_ID |
|-------------|----------------|------------|-----------|--------|
| 'Local' | 2016/9/5 14:37 | C1 | null | E1 |
| 'Local' | 2016/9/2 15:16 | C1 | null | E1 |
| 'Local' | 2016/9/1 15:20 | C1 | null | E1 |
| '3rd Party' | 2016/9/4 18:00 | null | D1 | E2 |
A script that I receive is to use SELECT MAX
in the WHERE clause
:
SELECT *
FROM TableW a
WHERE TRX_DATE =
CASE P_TYPE
WHEN 'Local' THEN
(SELECT MAX(TRX_DATE) FROM TableW
WHERE PROGRAM_NO = a.PROGRAM_NO AND SEQ_ID = a.SEQ_ID)
ELSE
(SELECT MAX(TRX_DATE) FROM TableW
WHERE REF_NO = a.REF_NO AND SEQ_ID = a.SEQ_ID)
END
ORDER BY TRX_DATE desc, REF_NO ASC, SEQ_ID;
It does the job. However with some research, it seems like partition by
is not that costly. Refer: Tune SQL statement with max subquery
I try to rewrite the query as:
SELECT *
FROM (
SELECT *,
CASE P_TYPE
WHEN 'Local' THEN
MAX(TRX_DATE) OVER (PARTITION BY PROGRAM_NO, SEQ_ID)
ELSE
MAX(TRX_DATE) OVER (PARTITION BY REF_NO, SEQ_ID)
END AS MAX_TRX_DATE
FROM TableW
WHERE P_TYPE = 'Local'
)
WHERE TRX_DATE = MAX_TRX_DATE
However, I get only this:
| P_TYPE | TRX_DATE | PROGRAM_NO | REF_NO | SEQ_ID |
|-------------|----------------|------------|-----------|--------|
| 'Local' | 2016/9/5 14:37 | C1 | null | E1 |
Any guideline please. If possible, please illustrate your suggestion with statistics. Thanks.
EDIT: It looks like using row_number and partition by will greatly reduce the execution plan and even the time?
| CASE | OPERATION | CARDINALITY | COST | LAST CR | LAST ELAPSED |
| | | | | BUFFER GETS | TIME |
|------------------|------------------|-------------|------|-------------|---------------|
| 1 - max() in | SELECT STATEMENT | | 76 | | |
| where clause | SORT (ORDER BY) | 1 | 76 | 477 | 3602 |
|------------------|------------------|-------------|------|-------------|---------------|
| 2 - row_number | SELECT STATEMENT | | 18 | | |
| | SORT (ORDER BY) | 8 | 18 | 53 | 607 |
|------------------|------------------|-------------|------|-------------|---------------|
答案 0 :(得分:2)
对于Local
行,您需要包含定义窗口分区的日期,因为PROGRAM_NO, REF_NO
的所有值对于这些行都是相同的:
select *
from (
SELECT *,
CASE P_TYPE
when 'Local' then
row_number() over (partition by program_no, seq_id, trunc(trx_date) order by trx_date desc)
else
row_number() over (partition by ref_no, seq_id order by trx_date desc)
end as rn
FROM TableW a
) t
where rn = 1;
在线示例:http://rextester.com/CZTY80559
(该示例使用Postgres,但除了“忽略”时间戳的时间部分的不同方式之外,它在Oracle中将是相同的)