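I am trying to delete a large amount of data from a vendor-designed table. It is over-indexed, and any update/insert/delete is painful. I cannot drop the nonclustered (NC) indexes. I am testing different ways of deleting the data in batches. Today I found that the statement below is much faster when I do not use a variable to hold the date. Why does that make such a difference? Is TempDB involved? Do you have a better solution to share? Performance is worse still when I replace the explicitly typed date with getdate().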
--example 1:
--very slow
declare @cleanday as datetime
select @cleanday = dateadd(day,-60,DATEADD(dd, 0, DATEDIFF(dd, 0, CAST('2013-12-22' as datetime))))
delete ES1
from ( select top (10000) es.id1 from es where
es.ID2 in
(
21,
20,
19,
151
)
and es.DateCreated < @cleanday
order by es.id1
) ES1
--example 2:
--much faster
delete ES1
from ( select top (10000) es.id1 from es where
es.ID2 in
(
21,
20,
19,
151
)
and es.DateCreated < dateadd(day,-60,CAST('2013-12-22' as datetime))
order by es.id1
) ES1
Answer 0 (score: 2)
/* Some Test Data */
CREATE TABLE Stats_Test_Table (ID INT NOT NULL PRIMARY KEY IDENTITY(1,1), VALUE INT)
GO
DECLARE @i INT = 1
WHILE (@i <= 100)
BEGIN
INSERT INTO Stats_Test_Table
VALUES (@i)
SET @i = @i + 1;
END
GO
/*
Execute the following command to flush any execution plans already
in your cache.
**WARNING**
DO NOT execute this command on a production server, as it
flushes the cached execution plans for all queries.
I assume you are doing all this on a test server anyway.
*/
-- Clear cache
DBCC FREEPROCCACHE;
GO
/*
Four queries with the same shape. The only difference:
the first two queries hardcode the value in the WHERE clause,
the last two use an INT variable in the WHERE clause.
*/
--Query 1 with Hardcoded value in WHERE clause
SELECT *
FROM Stats_Test_Table
WHERE ID = 50;
GO
--Query 2 with Hardcoded value in WHERE clause
SELECT *
FROM Stats_Test_Table
WHERE ID = 51;
GO
--Query 3 with Variable @ID_1 value in WHERE clause
DECLARE @ID_1 INT;
SET @ID_1 = 52;
SELECT *
FROM Stats_Test_Table
WHERE ID = @ID_1;
GO
--Query 4 with Variable @ID_2 value in WHERE clause
DECLARE @ID_2 INT;
SET @ID_2 = 52;
SELECT *
FROM Stats_Test_Table
WHERE ID = @ID_2;
GO
/*
Now execute the following statement to list the cached execution plans.
Remember: once you have cleared the cache with the DBCC command,
run all the queries above and the one below as soon as possible,
because SQL Server is constantly executing queries behind the scenes
that we don't see. The longer you take, the more rows you will have
in the result set of the following query.
*/
-- Query DMVs for execution plan reuse statistics
SELECT stats.execution_count AS [Execution_Count]
,p.size_in_bytes AS [Size]
,[sql].[text] AS [plan_text]
FROM sys.dm_exec_cached_plans p
OUTER APPLY sys.dm_exec_sql_text(p.plan_handle) sql
JOIN sys.dm_exec_query_stats stats
ON stats.plan_handle = p.plan_handle
ORDER BY [plan_text]
Cached execution plans
╔═════════════════╦═══════╦═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║ Execution_Count ║ Size ║ plan_text ║
╠═════════════════╬═══════╬═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╣
║ 1 ║ 40960 ║ --Query 3 with Variable @ID_1 value in WHERE clause DECLARE @ID_1 INT; SET @ID_1 = 52; SELECT * FROM Stats_Test_Table WHERE ID = @ID_1; ║
║ 2 ║ 32768 ║ (@1 tinyint)SELECT * FROM [Stats_Test_Table] WHERE [ID]=@1 ║
║ 1 ║ 40960 ║ --Query 4 with Variable @ID_2 value in WHERE clause DECLARE @ID_2 INT; SET @ID_2 = 52; SELECT * FROM Stats_Test_Table WHERE ID = @ID_2; ║
╚═════════════════╩═══════╩═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
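I executed four queries in total; call them Q1, Q2, Q3 and Q4. SQL Server created three execution plans for them.
Queries 1 and 2
(@1 tinyint)SELECT * FROM [Stats_Test_Table] WHERE [ID]=@1
If you look closely at the result set above, SQL Server created one execution plan for Q1 and reused it for Q2. Both have a hardcoded value in the WHERE clause.
The plan with an Execution_Count of 2 has a parameter in its text, [ID]=@1. This is called auto-parameterization: SQL Server replaced the literal with a parameter in the plan and reused that plan for the next execution.
Queries 3 and 4
For queries 3 and 4 we have two separate execution plans. Even though the two queries are more or less the same, this time SQL Server decided not to reuse a plan and created a new one for each query.
Conclusion
When a query uses a variable rather than a hardcoded value, SQL Server did not reuse a plan in this test; it compiled a separate plan for each query.
In your case, the first query uses a variable and the second uses a hardcoded value, so the second query is faster than the first :).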
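One thing that may be worth testing on top of this, offered as a sketch and not a guaranteed fix: with a local variable the optimizer generally cannot see the value when it compiles the statement, so it falls back on a generic estimate, whereas a literal can be matched against the column statistics. Adding OPTION (RECOMPILE) lets the statement be compiled with the variable's current value. The table and column names below are taken from the question; whether the hint helps on your data is an assumption you would need to confirm by comparing the actual execution plans.

```sql
-- Sketch only: the variable-based delete from the question, with a
-- statement-level recompile so the optimizer can use the runtime value
-- of @cleanday when estimating how many rows qualify.
DECLARE @cleanday datetime;
SET @cleanday = DATEADD(day, -60, CAST('2013-12-22' AS datetime));

DELETE ES1
FROM (
    SELECT TOP (10000) es.id1
    FROM es
    WHERE es.ID2 IN (21, 20, 19, 151)
      AND es.DateCreated < @cleanday
    ORDER BY es.id1
) ES1
OPTION (RECOMPILE);  -- compile this statement fresh on each execution
```

The trade-off is a compilation on every execution, which is usually negligible next to the cost of deleting 10,000 rows from a heavily indexed table.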
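Answer 1 (score: 0)
You can often see performance differences caused by the query optimizer when a WHERE clause (or equivalent) compares against a variable rather than a hardcoded value. That is, with a hardcoded value the optimizer may recognize that a particular index is optimal.
Sometimes you can also gain a lot by changing the isolation level, especially since you are deleting the data in batches with an ORDER BY.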
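Answer 2 (score: 0)
Don't overthink it. Just use TOP in the DELETE: http://technet.microsoft.com/en-us/library/ms175486%28v=sql.105%29.aspx
Example:
DELETE TOP (10000) FROM Table01
There are rules about random versus ordered deletion (for an ordered delete you must specify an ORDER BY clause). Depending on what you want to do and how you want to do it, this can get you there with relatively little work.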
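To make the batching concrete, here is a minimal sketch of a loop that repeats the delete until no rows are left. It reuses the table, columns and cutoff date from the question; the variable names @batch and @deleted are made up for illustration, and the batch size of 10000 is just the value from the question, not a recommendation.

```sql
-- Sketch only: delete in batches until nothing qualifies any more.
-- DELETE TOP removes an arbitrary set of qualifying rows; if the order
-- matters, use the subquery form with ORDER BY from the question.
DECLARE @cleanday datetime = DATEADD(day, -60, CAST('2013-12-22' AS datetime));
DECLARE @batch int = 10000;   -- hypothetical batch size
DECLARE @deleted int = 1;     -- rows removed by the previous batch

WHILE @deleted > 0
BEGIN
    DELETE TOP (@batch)
    FROM es
    WHERE ID2 IN (21, 20, 19, 151)
      AND DateCreated < @cleanday;

    -- Capture the row count immediately; the next statement resets it.
    SET @deleted = @@ROWCOUNT;
END
```

Run outside an explicit transaction, each batch commits on its own, which keeps individual transactions and lock durations short.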