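I'm trying to fix some errors in a large database of stock exchange data. One column (quantity) holds the volume traded on each tick, and another column stores the cumulative volume (i.e., the sum of the day's ticks so far). This second column is wrong in some cases (not many, so we can safely assume that no two adjacent ticks are both wrong). So in theory the fix is easy: search for a tick where the cumulative volume decreases (this is enough to find the errors), then take the cumulative volume from the previous tick and add the quantity of the current tick. The problem is that I've been trying to write a query that does this in Oracle, and I'm struggling because of my lack of SQL expertise. This is what I have so far: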
update
(
select m.cumulative_volume, q.cum_volume_ant, q.quantity from
market_data_intraday_trades m
join
(
select * from
(select
product_key,
sequence_number,
lead(product_key) over (order by product_key, sequence_number) as product_key_ant,
to_char(trade_date_time, 'yyyymmdd') as fecha,
to_char(lag(trade_date_time) over (order by product_key, sequence_number), 'yyyymmdd') as fecha_ant,
cumulative_volume,
lead(cumulative_volume) over (order by product_key, sequence_number) as cum_volume_ant,
cumulative_volume - lead(cumulative_volume) over (order by product_key, sequence_number) as dif
from market_data_intraday_trades)
where product_key = product_key_ant
and fecha = fecha_ant
and dif < 0
and rownum < 10
) q
on m.sequence_number = q.sequence_number
)
set m.cumulative_volume = q.cum_volume_ant + q.quantity
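The problem at the moment is that I can't seem to use the quantity from the inner query in the outer calculation.
All of this might be clearer and/or easier with a temporary table, PL/SQL or cursors, but because of company policy I don't have the rights to do that; I can only run selects and updates.
I'd be very grateful if you could point me in the right direction.
Thanks in advance!
PS. "Fecha" is Spanish for "date", just in case :)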
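The fix described in the question can be sketched outside the database to make the intended logic concrete. This is only an illustration in Python, not Oracle SQL: the tuple layout and the sample values are hypothetical, and it relies on the question's assumptions (errors show up as a decrease, and no two adjacent ticks are both wrong).

```python
from itertools import groupby

# Hypothetical stand-in for the table:
# (product_key, trade_day, sequence_number, quantity, cumulative_volume)
ticks = [
    ("ORCL", "20100626", 1, 100, 100),
    ("ORCL", "20100626", 2, 50, 150),
    ("ORCL", "20100626", 3, 100, 120),  # wrong: the cumulative volume decreased
    ("ORCL", "20100626", 4, 50, 300),
    ("ORCL", "20100627", 5, 75, 75),    # new day: the total legitimately restarts
]

def repair_decreases(rows):
    """Repair ticks whose cumulative volume is lower than the previous tick's
    (same product, same day) using previous cumulative volume + quantity."""
    fixed = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    for _, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        prev_cum = None
        for product, day, seq, qty, cum in group:
            if prev_cum is not None and cum < prev_cum:
                cum = prev_cum + qty
            fixed.append((product, day, seq, qty, cum))
            prev_cum = cum
    return fixed

repaired = repair_decreases(ticks)
print([row[4] for row in repaired])  # [100, 150, 250, 300, 75]
```

Grouping by product and day plays the role of the `product_key = product_key_ant and fecha = fecha_ant` checks in the query above.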
Answer 0 (score: 7)
Here is some test data. As you can see, the CUMULATIVE_VOLUME of the fourth row is wrong.
SQL> select product_key
2 , trade_date_time
3 , quantity
4 , cumulative_volume
5 , sum (quantity) over (partition by product_key order by sequence_number) as running_total
6 from market_data_intraday_trades
7 order by sequence_number
8 /
PROD TRADE_DAT   QUANTITY CUMULATIVE_VOLUME RUNNING_TOTAL
---- --------- ---------- ----------------- -------------
ORCL 23-JUN-10        100               100           100
ORCL 23-JUN-10         50               150           150
ORCL 25-JUN-10        100               250           250
ORCL 26-JUN-10        100               250           350
ORCL 26-JUN-10         50               400           400
ORCL 27-JUN-10         75               475           475
6 rows selected.
SQL>
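As a cross-check of the arithmetic in the RUNNING_TOTAL column, here is a minimal Python sketch that recomputes the running total for the six test rows and lists the rows where the stored value disagrees. The list layout is a hypothetical stand-in for the table; only the quantities and cumulative volumes come from the output above.

```python
from itertools import accumulate

# The six test rows in sequence_number order: (quantity, cumulative_volume)
rows = [(100, 100), (50, 150), (100, 250), (100, 250), (50, 400), (75, 475)]

# Running total of quantity, the same idea as sum(quantity) over (... order by ...)
running_totals = list(accumulate(qty for qty, _ in rows))

# Rows whose stored cumulative volume differs from the computed total,
# reported as (row position, stored value, expected value)
mismatches = [
    (position, stored, expected)
    for position, ((_, stored), expected)
    in enumerate(zip(rows, running_totals), start=1)
    if stored != expected
]

print(running_totals)  # [100, 150, 250, 350, 400, 475]
print(mismatches)      # [(4, 250, 350)]
```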
The simplest solution is to update every row with the calculated running total:
SQL> update market_data_intraday_trades m
2 set m.cumulative_volume =
3 ( select inq.running_total
4 from (
5 select sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 , cumulative_volume
8 , rowid as row_id
9 from market_data_intraday_trades
10 ) inq
11 where m.rowid = inq.row_id
12 )
13 /
6 rows updated.
SQL> select product_key
2 , trade_date_time
3 , quantity
4 , cumulative_volume
5 , sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 , rowid as row_id
8 from market_data_intraday_trades
9 order by sequence_number
10 /
PROD TRADE_DAT   QUANTITY CUMULATIVE_VOLUME RUNNING_TOTAL
---- --------- ---------- ----------------- -------------
ORCL 23-JUN-10        100               100           100
ORCL 23-JUN-10         50               150           150
ORCL 25-JUN-10        100               250           250
ORCL 26-JUN-10        100               350           350
ORCL 26-JUN-10         50               400           400
ORCL 27-JUN-10         75               475           475
6 rows selected.
SQL>
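However, if you have a lot of data and you really don't want all those unnecessary updates, use the same query a second time to restrict the update to the rows that are actually wrong: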
SQL> update market_data_intraday_trades m
2 set m.cumulative_volume =
3 ( select inq.running_total
4 from (
5 select sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 , cumulative_volume
8 , rowid as row_id
9 from market_data_intraday_trades
10 ) inq
11 where m.rowid = inq.row_id
12 )
13 where m.rowid in
14 ( select inq.row_id
15 from (
16 select sum (quantity) over (partition by product_key
17 order by sequence_number) as running_total
18 , cumulative_volume
19 , rowid as row_id
20 from market_data_intraday_trades
21 ) inq
22 where m.cumulative_volume != running_total
23 )
24
SQL> /
1 row updated.
SQL> select product_key
2 , trade_date_time
3 , quantity
4 , cumulative_volume
5 , sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 from market_data_intraday_trades
8 order by sequence_number
9 /
PROD TRADE_DAT   QUANTITY CUMULATIVE_VOLUME RUNNING_TOTAL
---- --------- ---------- ----------------- -------------
ORCL 23-JUN-10        100               100           100
ORCL 23-JUN-10         50               150           150
ORCL 25-JUN-10        100               250           250
ORCL 26-JUN-10        100               350           350
ORCL 26-JUN-10         50               400           400
ORCL 27-JUN-10         75               475           475
6 rows selected.
SQL>
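For readers without an Oracle instance to hand, the restricted update can be tried against an in-memory SQLite database from Python. This is a sketch under assumptions, not the statement above: SQLite (3.25 or later, for window functions) is only a stand-in for Oracle, the table is cut down to the four columns the statement needs, and the filter compares the two columns inside the derived table rather than referring to the outer alias.

```python
import sqlite3

# SQLite is an assumption here; the answer itself targets Oracle.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table market_data_intraday_trades ("
    " product_key text, sequence_number integer,"
    " quantity integer, cumulative_volume integer)"
)
conn.executemany(
    "insert into market_data_intraday_trades values (?, ?, ?, ?)",
    [("ORCL", 1, 100, 100), ("ORCL", 2, 50, 150), ("ORCL", 3, 100, 250),
     ("ORCL", 4, 100, 250), ("ORCL", 5, 50, 400), ("ORCL", 6, 75, 475)],
)

# The same derived table supplies both the new value and the row filter.
running = """
    select rowid as row_id, cumulative_volume,
           sum(quantity) over (partition by product_key
                               order by sequence_number) as running_total
    from market_data_intraday_trades
"""
cur = conn.execute(f"""
    update market_data_intraday_trades
    set cumulative_volume = (select inq.running_total from ({running}) inq
                             where inq.row_id = market_data_intraday_trades.rowid)
    where rowid in (select inq.row_id from ({running}) inq
                    where inq.cumulative_volume != inq.running_total)
""")
updated = cur.rowcount  # number of rows the update touched

stored = [row[0] for row in conn.execute(
    "select cumulative_volume from market_data_intraday_trades"
    " order by sequence_number")]
print(updated, stored)  # 1 [100, 150, 250, 350, 400, 475]
```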
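I tried Nicolas's suggestion of using MERGE. This will work if you are on 10g or higher. You need a recent version of Oracle because 9i doesn't support a MERGE that has an UPDATE but no INSERT (and 8i doesn't support MERGE at all).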
SQL> merge into market_data_intraday_trades m
2 using ( select running_total
3 , row_id
4 from
5 ( select sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 , cumulative_volume
8 , rowid as row_id
9 from market_data_intraday_trades
10 )
11 where cumulative_volume != running_total
12 ) inq
13 on ( m.rowid = inq.row_id )
14 when matched then
15 update set m.cumulative_volume = inq.running_total
16 /
1 row merged.
SQL>
This solution is neater than the others.
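The shape of an update-only MERGE can be illustrated in plain Python: build the source set once (rows whose stored value differs from the running total), then update only the target rows that match a source key, inserting nothing. The dictionary layout, the key choice and the second product are hypothetical, added purely to show the per-product partitioning.

```python
from collections import defaultdict

# Hypothetical target table keyed by (product_key, sequence_number)
target = {
    ("ORCL", 1): {"quantity": 100, "cumulative_volume": 100},
    ("ORCL", 2): {"quantity": 100, "cumulative_volume": 100},  # wrong, should be 200
    ("MSFT", 3): {"quantity": 40, "cumulative_volume": 40},
    ("MSFT", 4): {"quantity": 10, "cumulative_volume": 50},
}

# USING clause: running total per product, keeping only the rows that differ
totals = defaultdict(int)
source = {}
for key in sorted(target):
    product = key[0]
    totals[product] += target[key]["quantity"]
    if target[key]["cumulative_volume"] != totals[product]:
        source[key] = totals[product]

# WHEN MATCHED THEN UPDATE: only matching keys are touched, nothing is inserted
merged = 0
for key, running_total in source.items():
    if key in target:
        target[key]["cumulative_volume"] = running_total
        merged += 1

print(merged)  # 1
```

Because the source is computed once and already filtered, the update step only visits the rows that need correcting.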
Answer 1 (score: 4)
Just adding a performance comparison to APC's answer:
SQL> update market_data_intraday_trades m
2 set m.cumulative_volume =
3 ( select inq.running_total
4 from (
5 select sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 , cumulative_volume
8 , rowid as row_id
9 from market_data_intraday_trades
10 ) inq
11 where m.rowid = inq.row_id
12 )
13 /
6 rows updated.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'iostats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4mgw11769k00r, child number 0
-------------------------------------
update market_data_intraday_trades m set m.cumulative_volume = ( select inq.running_total
from ( select sum (quantity) over (partition by product_key
order by sequence_number) as running_total
, cumulative_volume , rowid as row_id from
market_data_intraday_trades ) inq where m.rowid = inq.row_id )
Plan hash value: 3204855846
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------------
| 1 | UPDATE | MARKET_DATA_INTRADAY_TRADES | 1 | | 0 |00:00:00.01 | 35 |
| 2 | TABLE ACCESS FULL | MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
|* 3 | VIEW | | 6 | 6 | 6 |00:00:00.01 | 18 |
| 4 | WINDOW SORT | | 6 | 6 | 36 |00:00:00.01 | 18 |
| 5 | TABLE ACCESS FULL| MARKET_DATA_INTRADAY_TRADES | 6 | 6 | 36 |00:00:00.01 | 18 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("INQ"."ROW_ID"=:B1)
25 rows selected.
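Look at those 36s in the A-Rows column: the subquery is started once for each of the 6 rows (Starts = 6) and reads all 6 rows every time. That's O(N^2).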
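The row counts in that plan can be reproduced with a small counting sketch. It is a Python illustration of the access pattern only, not of how Oracle executes the statement: the correlated form recomputes the windowed totals for every row being updated, while a single-pass form computes them once and looks each row up.

```python
from itertools import accumulate

quantities = [100, 50, 100, 100, 50, 75]  # the six test rows
n = len(quantities)

# Correlated form: the windowed subquery is re-run for every updated row.
rows_read_correlated = 0
for i in range(n):
    totals = list(accumulate(quantities))  # full pass over all n rows
    rows_read_correlated += n
    _ = totals[i]                          # only one of the n results is kept

# Single-pass form: compute the totals once, then look each row up.
totals_once = list(accumulate(quantities))
rows_read_single = n

print(rows_read_correlated, rows_read_single)  # 36 6
```

With 6 rows the difference is invisible, but the first count grows with the square of the table size.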
SQL> update market_data_intraday_trades m
2 set m.cumulative_volume =
3 ( select inq.running_total
4 from (
5 select sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 , cumulative_volume
8 , rowid as row_id
9 from market_data_intraday_trades
10 ) inq
11 where m.rowid = inq.row_id
12 )
13 where m.rowid in
14 ( select inq.row_id
15 from (
16 select sum (quantity) over (partition by product_key
17 order by sequence_number) as running_total
18 , cumulative_volume
19 , rowid as row_id
20 from market_data_intraday_trades
21 ) inq
22 where m.cumulative_volume != running_total
23 )
24
SQL> /
1 row updated.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'iostats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8fg3vnav1t742, child number 0
-------------------------------------
update market_data_intraday_trades m set m.cumulative_volume = ( select inq.running_total
from ( select sum (quantity) over (partition by product_key
order by sequence_number) as running_total ,
cumulative_volume , rowid as row_id from
market_data_intraday_trades ) inq where m.rowid = inq.row_id )
where m.rowid in ( select inq.row_id from ( select sum (quantity)
over (partition by product_key order by
sequence_number) as running_total , cumulative_volume
, rowid as row_id from market_data_intraday_trades ) inq
where m.cumulative_volume != running_total )
Plan hash value: 1087408236
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------------
| 1 | UPDATE | MARKET_DATA_INTRADAY_TRADES | 1 | | 0 |00:00:00.01 | 14 |
|* 2 | HASH JOIN SEMI | | 1 | 5 | 1 |00:00:00.01 | 6 |
| 3 | TABLE ACCESS FULL | MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
| 4 | VIEW | | 1 | 6 | 6 |00:00:00.01 | 3 |
| 5 | WINDOW SORT | | 1 | 6 | 6 |00:00:00.01 | 3 |
| 6 | TABLE ACCESS FULL| MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
|* 7 | VIEW | | 1 | 6 | 1 |00:00:00.01 | 4 |
| 8 | WINDOW SORT | | 1 | 6 | 6 |00:00:00.01 | 4 |
| 9 | TABLE ACCESS FULL | MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 4 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("M".ROWID="INQ"."ROW_ID")
filter("M"."CUMULATIVE_VOLUME"<>"RUNNING_TOTAL")
7 - filter("INQ"."ROW_ID"=:B1)
36 rows selected.
That's much better.
SQL> merge into market_data_intraday_trades mdit1
2 using ( select product_key
3 , sequence_number
4 , running_total
5 from ( select product_key
6 , sequence_number
7 , cumulative_volume
8 , sum(quantity) over (partition by product_key order by sequence_number) as running_total
9 from market_data_intraday_trades
10 )
11 where cumulative_volume != running_total
12 ) mdit2
13 on ( mdit1.product_key = mdit2.product_key
14 and mdit1.sequence_number = mdit2.sequence_number
15 )
16 when matched then
17 update set mdit1.cumulative_volume = mdit2.running_total
18 /
1 row merged.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'iostats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID cjafdk3jg4gzz, child number 0
-------------------------------------
merge into market_data_intraday_trades mdit1 using ( select product_key , sequence_number
, running_total from ( select product_key , sequence_number
, cumulative_volume , sum(quantity) over (partition by
product_key order by sequence_number) as running_total from
market_data_intraday_trades ) where cumulative_volume != running_total )
mdit2 on ( mdit1.product_key = mdit2.product_key and mdit1.sequence_number =
mdit2.sequence_number ) when matched then update set mdit1.cumulative_volume =
mdit2.running_total
Plan hash value: 2367693855
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------
| 1 | MERGE | MARKET_DATA_INTRADAY_TRADES | 1 | | 1 |00:00:00.01 | 9 |
| 2 | VIEW | | 1 | | 1 |00:00:00.01 | 6 |
|* 3 | HASH JOIN | | 1 | 6 | 1 |00:00:00.01 | 6 |
|* 4 | VIEW | | 1 | 6 | 1 |00:00:00.01 | 3 |
| 5 | WINDOW SORT | | 1 | 6 | 6 |00:00:00.01 | 3 |
| 6 | TABLE ACCESS FULL| MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
| 7 | TABLE ACCESS FULL | MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("MDIT1"."PRODUCT_KEY"="PRODUCT_KEY" AND "MDIT1"."SEQUENCE_NUMBER"="SEQUENCE_NUMBER")
4 - filter("CUMULATIVE_VOLUME"<>"RUNNING_TOTAL")
31 rows selected.
But the MERGE wins, with one fewer table scan.
Regards, Rob.
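Answer 2 (score: 3)
Have you tried the MERGE statement? Depending on your Oracle version it may be worth investigating; at the very least it could make your statement simpler.
Nicolas.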