I created a user-defined function to improve the performance of queries containing 'WHERE col IN (...)', such as the following:
SELECT myCol1, myCol2
FROM myTable
WHERE myCol3 IN (100, 200, 300, ..., 4900, 5000);
The queries are generated by a web application and are in some cases much more complex. The function definition looks like this:
CREATE FUNCTION [dbo].[udf_CSVtoIntTable]
(
@CSV VARCHAR(MAX),
@Delimiter CHAR(1) = ','
)
RETURNS
@Result TABLE
(
[Value] INT
)
AS
BEGIN
-- Positions are INT rather than SMALLINT: a VARCHAR(MAX) list can be longer than 32,767 characters
DECLARE @CurrStartPos INT;
SET @CurrStartPos = 1;
DECLARE @CurrEndPos INT;
SET @CurrEndPos = 1;
DECLARE @TotalLength INT;
-- Remove space, tab, linefeed, carriage return
SET @CSV = REPLACE(@CSV, ' ', '');
SET @CSV = REPLACE(@CSV, CHAR(9), '');
SET @CSV = REPLACE(@CSV, CHAR(10), '');
SET @CSV = REPLACE(@CSV, CHAR(13), '');
-- Add extra delimiter if needed
IF NOT RIGHT(@CSV, 1) = @Delimiter
SET @CSV = @CSV + @Delimiter;
-- Get total string length
SET @TotalLength = LEN(@CSV);
WHILE @CurrStartPos < @TotalLength
BEGIN
SET @CurrEndPos = CHARINDEX(@Delimiter, @CSV, @CurrStartPos);
INSERT INTO @Result
VALUES (CAST(SUBSTRING(@CSV, @CurrStartPos, @CurrEndPos - @CurrStartPos) AS INT));
SET @CurrStartPos = @CurrEndPos + 1;
END
RETURN
END
The function is intended to be used like this (or in an INNER JOIN):
SELECT myCol1, myCol2
FROM myTable
WHERE myCol3 IN (
SELECT [Value]
FROM dbo.udf_CSVtoIntTable('100, 200, 300, ..., 4900, 5000', ','));
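Does anyone have ideas for optimizing my function, or other ways to improve performance in my case? Are there any drawbacks I have missed?
I am using MS SQL Server 2005 Std and .NET Framework 2.0.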
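Answer 0 (score: 1)
I'm not sure about the performance gain, but I would use it as an inner join and stay away from the inner SELECT statement.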
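A minimal sketch of that join form, assuming the asker's udf_CSVtoIntTable and the table and column names from the question. The short literal list is only a placeholder. Note that, unlike IN, a join returns a row once per matching list entry, so a list with repeated values would duplicate rows unless it is de-duplicated first.

-- Join against the split list instead of using WHERE ... IN (SELECT ...)
SELECT t.myCol1, t.myCol2
FROM myTable AS t
INNER JOIN (
    -- DISTINCT keeps a repeated list value from duplicating result rows
    SELECT DISTINCT [Value]
    FROM dbo.udf_CSVtoIntTable('100, 200, 300', ',')
) AS ids ON ids.[Value] = t.myCol3;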
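Answer 1 (score: 1)
Using a UDF or (worse) a subquery in the WHERE clause is asking for trouble. The optimizer sometimes handles it correctly, but it often gets it wrong and evaluates the function once for every row in the query, which you don't want.
If your parameters are static (they appear to be) and you can issue a multi-statement batch, I would load the results of the UDF into a table variable and then filter by joining against that table variable. That should be more reliable.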
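A rough sketch of that approach, assuming the asker's udf_CSVtoIntTable and the names from the question. The variable name @Ids and the short literal list are made up for illustration. The syntax is kept to what SQL Server 2005 accepts.

-- 1. Split the list once, up front, into a table variable
DECLARE @Ids TABLE ([Value] INT NOT NULL PRIMARY KEY);

INSERT INTO @Ids ([Value])
SELECT DISTINCT [Value]   -- DISTINCT so the primary key is not violated by repeated values
FROM dbo.udf_CSVtoIntTable('100, 200, 300', ',');

-- 2. Filter by joining against the table variable; the UDF is not referenced in this query
SELECT t.myCol1, t.myCol2
FROM myTable AS t
INNER JOIN @Ids AS i ON i.[Value] = t.myCol3;

Both statements have to be sent in the same batch, because the table variable goes out of scope when the batch ends.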
Answer 2 (score: 1)
That loop will kill performance!
Create a table like this:
CREATE TABLE Numbers
(
Number int not null primary key
)
Populate it with rows holding the values 1 to 8000 or so, and use this function:
CREATE FUNCTION [dbo].[FN_ListAllToNumTable]
(
@SplitOn char(1) --REQUIRED, the character to split the @List string on
,@List varchar(8000) --REQUIRED, the list to split apart
)
RETURNS
@ParsedList table
(
RowNumber int
,ListValue varchar(500)
)
AS
BEGIN
/*
DESCRIPTION: Takes the given @List string and splits it apart based on the given @SplitOn character.
A table is returned, one row per split item, with columns named "RowNumber" and "ListValue".
This function works for fixed or variable length items.
Empty and null items will be included in the result set.
PARAMETERS:
@SplitOn char(1) --REQUIRED, the character to split the @List string on
@List varchar(8000) --REQUIRED, the list to split apart
RETURN VALUES:
a table, one row per item in the list, with columns named "RowNumber" and "ListValue"
TEST WITH:
----------
SELECT * FROM dbo.FN_ListAllToNumTable(',','1,12,123,1234,54321,6,A,*,|||,,,,B')
DECLARE @InputList varchar(200)
SET @InputList='17;184;75;495'
SELECT
'well formed list',LEFT(@InputList,40) AS InputList,h.Name
FROM Employee h
INNER JOIN dbo.FN_ListAllToNumTable(';',@InputList) dt ON h.EmployeeID=dt.ListValue
WHERE dt.ListValue IS NOT NULL
SET @InputList='17;;;184;75;495;;;'
SELECT
'poorly formed list join',LEFT(@InputList,40) AS InputList,h.Name
FROM Employee h
INNER JOIN dbo.FN_ListAllToNumTable(';',@InputList) dt ON h.EmployeeID=dt.ListValue
SELECT
'poorly formed list',LEFT(@InputList,40) AS InputList, ListValue
FROM dbo.FN_ListAllToNumTable(';',@InputList)
**/
/*this will return empty rows, and row numbers*/
INSERT INTO @ParsedList
(RowNumber,ListValue)
SELECT
ROW_NUMBER() OVER(ORDER BY number) AS RowNumber
,LTRIM(RTRIM(SUBSTRING(ListValue, number+1, CHARINDEX(@SplitOn, ListValue, number+1)-number - 1))) AS ListValue
FROM (
SELECT @SplitOn + @List + @SplitOn AS ListValue
) AS InnerQuery
INNER JOIN Numbers n ON n.Number < LEN(InnerQuery.ListValue)
WHERE SUBSTRING(ListValue, number, 1) = @SplitOn
RETURN
END /*Function FN_ListAllToNumTable*/
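I have other versions that do not return empty or null rows, that return only the item without the row number, and so on. Look at the header comment to see how to use this as part of a JOIN, which is much faster than using it in a WHERE clause.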
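The answer does not show how the Numbers table gets its rows. One common way to fill it on SQL Server 2005, offered here only as a sketch, is to number the rows of a cross join of a system catalog view. The choice of sys.all_objects as the row source is an assumption, and any source with at least 8000 rows would do.

-- Fill Numbers with 1..8000 in a single set-based statement
INSERT INTO Numbers (Number)
SELECT TOP 8000
    ROW_NUMBER() OVER (ORDER BY a.object_id)
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;   -- the cross join only serves to produce enough rows

-- Sanity check: row count, lowest and highest number
SELECT COUNT(*) AS RowCnt, MIN(Number) AS MinNumber, MAX(Number) AS MaxNumber
FROM Numbers;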
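Answer 3 (score: 1)
The CLR solution did not perform well for me, so I will use a recursive query. Here is the definition of the function I will use (mostly based on Erland Sommarskog's example):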
CREATE FUNCTION [dbo].[priudf_CSVtoIntTable]
(
@CSV VARCHAR(MAX),
@Delimiter CHAR(1) = ','
)
RETURNS
@Result TABLE
(
[Value] INT
)
AS
BEGIN
-- Remove space, tab, linefeed, carriage return
SET @CSV = REPLACE(@CSV, ' ', '');
SET @CSV = REPLACE(@CSV, CHAR(9), '');
SET @CSV = REPLACE(@CSV, CHAR(10), '');
SET @CSV = REPLACE(@CSV, CHAR(13), '');
WITH csvtbl(start, stop) AS
(
SELECT start = CONVERT(BIGINT, 1),
stop = CHARINDEX(@Delimiter, @CSV + @Delimiter)
UNION ALL
SELECT start = stop + 1,
stop = CHARINDEX(@Delimiter, @CSV + @Delimiter, stop + 1)
FROM csvtbl
WHERE stop > 0
)
INSERT INTO @Result
SELECT CAST(SUBSTRING(@CSV, start, CASE WHEN stop > 0 THEN stop - start ELSE 0 END) AS INT) AS [Value]
FROM csvtbl
WHERE stop > 0
OPTION (MAXRECURSION 1000) -- raises an error for lists longer than about 1000 items; 0 removes the limit
RETURN
END
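A quick way to check the function above, with literal lists chosen only for illustration. One edge case is worth knowing: an empty item (for example after a trailing delimiter) becomes an empty string, and CAST('' AS INT) yields 0 rather than NULL or an error, so it shows up as a row with the value 0.

-- Basic check: should return one row each for 100, 200 and 300
SELECT [Value]
FROM dbo.priudf_CSVtoIntTable('100, 200, 300', ',');

-- A trailing delimiter produces an extra item that is cast to 0
SELECT [Value]
FROM dbo.priudf_CSVtoIntTable('100,200,', ',');

If 0 is never a legitimate key, filtering with WHERE [Value] <> 0 on the caller's side is one way to drop such rows.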
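Answer 4 (score: 0)
Thanks for your input. I have to admit that I did some poor research before starting on this. I found that Erland Sommarskog has written a lot about this problem on his web page, and after your replies and after reading his page I decided to try building a CLR function to solve it.
I tried a recursive query, which performed well, but I will try the CLR function anyway.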