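I have a query that now runs very slowly. It starts from a composite query over our stock positions (I'll call it POSITION_QUERY; it has one row per stock code traded on one exchange on a given date), which is then joined (I'll call this the FIRST JOIN) to the stock price table to get the price. The join condition is on three columns: stock code, exchange, and trade date. I then need a SECOND JOIN, because each stock belongs to a composite index (each row in POSITION_QUERY has columns giving the index code and the exchange the index trades on).
So my query looks like this: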
SELECT * FROM
POSITION_QUERY t1
JOIN DAILY_PRICE t2
on t1.STOCK_CODE = t2.STOCK_CODE
and t1.STOCK_EXCHANGE = t2.EXCHANGE
and t2.TRADE_DATE = 20181121
JOIN DAILY_PRICE t3
on t1.INDEX_CODE = t3.STOCK_CODE
and t1.INDEX_EXCHANGE = t3.EXCHANGE
and t3.TRADE_DATE = 20181121
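The query is really slow now: it takes about 3 minutes to return 50 rows. As I mentioned, POSITION_QUERY is actually a query, not an existing table. However, running SELECT * FROM POSITION_QUERY on its own is fast (inside POSITION_QUERY I only fetch positions for 20181121, so it already returns just the 50 rows mentioned above). DAILY_PRICE is a view, but it maps almost directly onto one existing table, and that table has indexes on all the join columns.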
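What is strange to me is that if I run only POSITION_QUERY, or POSITION_QUERY with the FIRST JOIN (that is, joined to DAILY_PRICE on the first set of conditions), or POSITION_QUERY with the SECOND JOIN (joined to DAILY_PRICE on the second set of conditions), all three queries run very fast (under a second).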
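I checked the actual execution plans. The two-join plan and the one-join plans are very similar, but the two-join plan contains a Table Spool (Lazy Spool) with a cost of 49%. The output list of the table spool operator is POSITION_QUERY, so I guess it is storing the POSITION_QUERY result (but why isn't it simply two consecutive joins?). I have a hard time reading execution plans, so I don't know whether this is the problem or how to fix it.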
Update: I have pasted the execution plan along with the real table structure and query. Link: Execution plan
Answer 0 (score: 1)
Try this:
WITH DAILY_PRICE_TODAY (STOCK_CODE, EXCHANGE)
AS
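-- Define the CTE query: DAILY_PRICE rows for the target date only.
-- Only the join keys are selected here; add any price columns you need to both column lists.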
(
SELECT STOCK_CODE, EXCHANGE
FROM DAILY_PRICE
WHERE TRADE_DATE = 20181121
)
SELECT * FROM
POSITION_QUERY t1
JOIN DAILY_PRICE_TODAY t2
on t1.STOCK_CODE = t2.STOCK_CODE
and t1.STOCK_EXCHANGE = t2.EXCHANGE
JOIN DAILY_PRICE_TODAY t3
on t1.INDEX_CODE = t3.STOCK_CODE
and t1.INDEX_EXCHANGE = t3.EXCHANGE
Answer 1 (score: 1)
What are the data types? After generating 520,000 rows of sample data with assumed data types, the query takes only 3 seconds to run:
CREATE TABLE POSITION_QUERY (STOCK_CODE INT, STOCK_EXCHANGE INT, INDEX_CODE INT, INDEX_EXCHANGE INT, TRADE_DATE INT)
CREATE TABLE DAILY_PRICE (STOCK_CODE INT, EXCHANGE INT, TRADE_DATE INT)
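-- Put 520,000 rows of sample data in POSITION_QUERY.
-- (10 x 10 x 10 x 10 code combinations x 52 dates when run on 2018-11-21; running it later generates more dates and more rows.)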
;WITH CTE AS (
SELECT 1 AS A
UNION ALL
SELECT A + 1
FROM CTE
WHERE A < 10
),
CTE_DATE AS (
SELECT CAST(GETDATE() AS DATE) AS D
UNION ALL
SELECT DATEADD(DAY, -1, D)
FROM CTE_DATE
WHERE D > '10/1/2018'
)
INSERT INTO POSITION_QUERY
SELECT C1.A, C2.A, C3.A, C4.A, FORMAT(C5.D, 'yyyyMMdd')
FROM CTE C1, CTE C2, CTE C3, CTE C4, CTE_DATE C5
OPTION (MAXRECURSION 0)
-- Put 5,200 rows of sample data in DAILY_PRICE that match all POSITION_QUERY records
;WITH CTE AS (
SELECT 1 AS A
UNION ALL
SELECT A + 1
FROM CTE
WHERE A < 10
),
CTE_DATE AS (
SELECT CAST(GETDATE() AS DATE) AS D
UNION ALL
SELECT DATEADD(DAY, -1, D)
FROM CTE_DATE
WHERE D > '10/1/2018'
)
INSERT INTO DAILY_PRICE
SELECT C1.A, C2.A, FORMAT(C3.D, 'yyyyMMdd')
FROM CTE C1, CTE C2, CTE_DATE C3
OPTION (MAXRECURSION 0)
-- Create nonclustered indexes on both tables' pertinent columns.
CREATE NONCLUSTERED INDEX IDX_POSITION_QUERY
ON [dbo].[POSITION_QUERY] ([STOCK_CODE],[STOCK_EXCHANGE])
INCLUDE ([INDEX_CODE],[INDEX_EXCHANGE],[TRADE_DATE])
GO
CREATE NONCLUSTERED INDEX IDX_DAILY_PRICE
ON DAILY_PRICE (STOCK_CODE, EXCHANGE, TRADE_DATE)
GO
-- Finally, run the query. It takes 3 seconds to return 520k records.
SELECT * FROM
POSITION_QUERY t1
JOIN DAILY_PRICE t2
on t1.STOCK_CODE = t2.STOCK_CODE
and t1.STOCK_EXCHANGE = t2.EXCHANGE
and t2.TRADE_DATE = 20181121
JOIN DAILY_PRICE t3
on t1.INDEX_CODE = t3.STOCK_CODE
and t1.INDEX_EXCHANGE = t3.EXCHANGE
and t3.TRADE_DATE = 20181121
Here is the execution plan:
https://www.brentozar.com/pastetheplan/?id=BkSgin7C7
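Can you paste your execution plan? There may be a bad type conversion somewhere. Even without the indexes I created, the query takes only 14 seconds.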
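One way to look for a mismatched type before reading the plan is to list the declared types of the join columns on both sides. This is only a sketch: it assumes the object names from the sample script above, and in the asker's case the text of the real position subquery would replace the SELECT passed to sys.dm_exec_describe_first_result_set.

```sql
-- Declared types of the join columns in DAILY_PRICE (works for views as well as tables).
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'DAILY_PRICE'
  AND COLUMN_NAME IN ('STOCK_CODE', 'EXCHANGE', 'TRADE_DATE');

-- Result-set types of the position side. POSITION_QUERY is a table in the sample
-- script; for a real subquery, paste its text into the string instead (assumption).
SELECT name, system_type_name
FROM sys.dm_exec_describe_first_result_set(
    N'SELECT STOCK_CODE, STOCK_EXCHANGE, INDEX_CODE, INDEX_EXCHANGE FROM POSITION_QUERY',
    NULL,
    0);
```

If, for example, TRADE_DATE turns out to be a character or date type while the query compares it to the integer literal 20181121, or a code column is varchar on one side and nvarchar on the other, the implicit conversion may be what keeps the optimizer from seeking on the index.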
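Answer 2 (score: 0)
Without being able to test this myself, I can offer a strategy I like to use that often speeds up queries: store whatever you can in a temp table and index it precisely to fit the needs of the main query. In this case, it looks like you can pull the data you need out of DAILY_PRICE and then index it on STOCK_CODE and EXCHANGE, like so: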
DROP TABLE IF EXISTS #temp;
SELECT *
INTO #temp
FROM DAILY_PRICE
WHERE TRADE_DATE = 20181121;
CREATE INDEX [IX1] ON #temp(STOCK_CODE, EXCHANGE);
SELECT *
FROM POSITION_QUERY t1
JOIN #temp t2
on t1.STOCK_CODE = t2.STOCK_CODE
and t1.STOCK_EXCHANGE = t2.EXCHANGE
JOIN #temp t3
on t1.INDEX_CODE = t3.STOCK_CODE
and t1.INDEX_EXCHANGE = t3.EXCHANGE
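This may give faster results because it leaves the query optimizer with different options: it can only work with what you have handed it, instead of going back to the base tables, which can sometimes lead to expensive operations such as spooling, hashing, or parallelism.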
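Since the spool in the question sits on the POSITION_QUERY side, the same idea can be pointed at the position rows instead. The following is a sketch, not something tested against the asker's data: #positions is a made-up name, POSITION_QUERY stands in for the real position subquery, and SELECT ... INTO assumes that subquery returns uniquely named columns.

```sql
-- Option A: materialize the ~50 position rows once, then join the small temp table.
DROP TABLE IF EXISTS #positions;

SELECT *
INTO #positions
FROM POSITION_QUERY;   -- placeholder for the actual position subquery

SELECT *
FROM #positions t1
JOIN DAILY_PRICE t2
  ON t1.STOCK_CODE = t2.STOCK_CODE
 AND t1.STOCK_EXCHANGE = t2.EXCHANGE
 AND t2.TRADE_DATE = 20181121
JOIN DAILY_PRICE t3
  ON t1.INDEX_CODE = t3.STOCK_CODE
 AND t1.INDEX_EXCHANGE = t3.EXCHANGE
 AND t3.TRADE_DATE = 20181121;

-- Option B: keep the original query but ask the optimizer not to add a
-- performance spool (query hint available in SQL Server 2016 and later).
SELECT *
FROM POSITION_QUERY t1
JOIN DAILY_PRICE t2
  ON t1.STOCK_CODE = t2.STOCK_CODE
 AND t1.STOCK_EXCHANGE = t2.EXCHANGE
 AND t2.TRADE_DATE = 20181121
JOIN DAILY_PRICE t3
  ON t1.INDEX_CODE = t3.STOCK_CODE
 AND t1.INDEX_EXCHANGE = t3.EXCHANGE
 AND t3.TRADE_DATE = 20181121
OPTION (NO_PERFORMANCE_SPOOL);
```

Option A changes what the optimizer has to estimate, because the temp table has a known row count; Option B is more of a diagnostic, since comparing its plan and run time with the original shows whether the lazy spool is really what costs the three minutes.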