Converting a SQL function to a stored procedure

Date: 2017-07-19 18:37:21

Tags: sql-server function stored-procedures

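I'm having trouble converting a UDF into a stored procedure.

Here is what I have. This is the stored procedure that calls the function (I use it to find and remove every UNICODE character that is not between 32 and 126):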

ALTER PROCEDURE [dbo].[spRemoveUNICODE] 
    @FieldList varchar(250) = '', 
    @Multiple int = 0,
    @TableName varchar(100) = ''
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @SQL VARCHAR(MAX), @counter INT = 0

    IF @Multiple > 0
    BEGIN
        DECLARE @Field VARCHAR(100)

        SELECT splitdata 
        INTO #TempValue 
        FROM dbo.fnSplitString(@FieldList,',')

        WHILE (SELECT COUNT(*) FROM #TempValue) >= 1
        BEGIN
            DECLARE @Column VARCHAR(100) = (SELECT TOP 1 splitdata FROM #TempValue)

            SET @SQL = 'UPDATE ' + @TableName + ' SET ' + @Column + ' = dbo.RemoveNonASCII(' + @Column + ')'

            EXEC (@SQL)
            --print @SQL

            SET @counter = @counter + 1

            -- @counter is an INT, so it must be cast before string concatenation
            PRINT @Column + ' was checked for ' + CAST(@counter AS varchar(10)) + ' rows.'

            DELETE FROM #TempValue
            WHERE splitdata = @Column
        END
    END
    ELSE IF @Multiple = 0
    BEGIN
        SET @SQL = 'UPDATE ' + @TableName + ' SET ' + @FieldList + ' = dbo.RemoveNonASCII(' + @FieldList + ')'

        EXEC (@SQL)
        --print @SQL

        SET @counter = @counter + 1

        -- @Column is never assigned in this branch, so report @FieldList instead
        PRINT @FieldList + ' was checked for ' + CAST(@counter AS varchar(10)) + ' rows.'
    END
END

This is the UDF (RemoveNonASCII) that I created to help with the update:

ALTER FUNCTION [dbo].[RemoveNonASCII] 
    (@nstring nvarchar(max))
RETURNS varchar(max)
AS
BEGIN
    -- Variables
    DECLARE @Result varchar(max) = '',@nchar nvarchar(1), @position int

    -- T-SQL statements to compute the return value
    set @position = 1
    while @position <= LEN(@nstring)
    BEGIN
        set @nchar = SUBSTRING(@nstring, @position, 1)
        if UNICODE(@nchar) between 32 and 127
            set @Result = @Result + @nchar
        set @position = @position + 1
        set @Result = REPLACE(@Result,'))','')
        set @Result = REPLACE(@Result,'?','')
    END
    if (@Result = '')
    set @Result = null
    -- Return the result
    RETURN @Result

END

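I have been trying to convert it into a stored procedure because I want to track how many rows are actually updated when it runs. Right now it just reports that all rows were updated, no matter how many I run it against. I want to know if, say, only half of them had bad characters. The stored procedure is already set up to tell me which column it is looking at, and I want it to include how many rows were updated. Here is what I have tried so far: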

DECLARE @Result varchar(max) = '',@nchar nvarchar(1), @position int, @nstring nvarchar(max), @counter int = 0, @CountRows int = 0, @Length int
--select Notes from #Temp where Notes is not null order by Notes OFFSET @counter ROWS FETCH NEXT 1 ROWS ONLY
set @nstring = (select Notes from #Temp where Notes is not null order by Notes OFFSET @counter ROWS FETCH NEXT 1 ROWS ONLY)
set @Length = LEN(@nstring)
if @Length = 0 set @Length = 1
-- Add the T-SQL statements to compute the return value here
set @position = 1
while @position <= @Length
BEGIN
    print @counter
    print @CountRows
    select @nstring
    set @nchar = SUBSTRING(@nstring, @position, 1)
    if UNICODE(@nchar) between 32 and 127
    begin
        print unicode(@nchar)
        set @Result = @Result + @nchar
        set @counter = @counter + 1
    end
    if UNICODE(@nchar) not between 32 and 127
    begin
        set @CountRows = @CountRows + 1
    end
    set @position = @position + 1
END
print 'Rows found with invalid UNICODE: ' + convert(varchar,@CountRows)

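For now I deliberately created a temp table, added a bunch of notes to it, and then added a bunch of invalid characters.

I created a list of more than 700 notes and then updated two of them with some invalid characters (outside 32 - 127). Some of the notes are NULL, and some are not NULL but have nothing in them. What happens is that I get 0 updates: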

  

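Rows found with invalid UNICODE: 0

It does, however, show that the UNICODE value it is looking at is 32.

Obviously I'm missing something, but I don't know what it is.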

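2 Answers:

Answer 0: (score: 2)

Here is a set-based solution for handling the bulk replacement. Instead of a slow scalar function, it uses an inline table-valued function, which is much faster than its scalar ancestors. I use a tally table here. I keep one on my system as a view, like this: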

create View [dbo].[cteTally] as

WITH
    E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
    cteTally(N) AS 
    (
        SELECT  ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
    )
select N from cteTally

If you are interested in tally tables, here is an excellent article on the topic: http://www.sqlservercentral.com/articles/T-SQL/62867/
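To see what the tally table contributes before it is wrapped in a function, the following is a minimal sketch that lists the out-of-range characters in a single value. It builds a small inline tally (100 rows) so that it does not depend on the cteTally view, and the sample string is made up for illustration.

```sql
-- Hypothetical sample value: NCHAR(8364) is the euro sign, NCHAR(9) is a tab.
DECLARE @SearchVal nvarchar(max) = N'abc' + NCHAR(8364) + N'de' + NCHAR(9) + N'f';

WITH
    E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) dt(n)),
    E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b),   -- 10 x 10 = 100 rows
    Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E2)
SELECT  t.N                                    AS Position,
        UNICODE(SUBSTRING(@SearchVal, t.N, 1)) AS CodePoint
FROM    Tally t
WHERE   t.N <= LEN(@SearchVal)                 -- one tally row per character
  AND   UNICODE(SUBSTRING(@SearchVal, t.N, 1)) NOT BETWEEN 32 AND 127
ORDER BY t.N;
```

For this sample the query should report positions 4 and 7, the euro sign and the tab. The function in this answer uses the same join but keeps the in-range characters and concatenates them back together.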

create function RemoveNonASCII
(
    @SearchVal nvarchar(max)
) returns table as 
    RETURN
    with MyValues as
    (
        select substring(@SearchVal, N, 1) as MyChar
            , t.N
        from cteTally t 
        where N <= len(@SearchVal)
            and UNICODE(substring(@SearchVal, N, 1)) between 32 and 127 
    )

    select distinct MyResult = STUFF((select MyChar + ''
                    from MyValues mv2
                    order by mv2.N
                    --for xml path('')), 1, 0, '')
                    FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'), 1, 0, '')
        from MyValues mv
    ;

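Now you can use CROSS APPLY instead of being forced to call it for every row. The performance benefit for this part of your original question should be quite large.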
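As a rough sketch of what the CROSS APPLY call could look like: the #Notes table and its rows below are hypothetical, and the code assumes the inline function above has been created as dbo.RemoveNonASCII (which means the original scalar function of the same name was dropped or renamed first).

```sql
-- Hypothetical table standing in for the real data.
CREATE TABLE #Notes (ID int PRIMARY KEY, Notes nvarchar(max));
INSERT INTO #Notes (ID, Notes)
VALUES (1, N'clean text'),
       (2, N'bad' + NCHAR(8364) + N'char'),
       (3, NULL);

-- Preview original and cleaned values side by side.
-- OUTER APPLY keeps rows for which the function returns no row
-- (NULL input, or no characters in range at all).
SELECT  n.ID, n.Notes, r.MyResult
FROM    #Notes n
OUTER APPLY dbo.RemoveNonASCII(n.Notes) r;

-- Update only the rows whose value actually changes.
-- Cleaning only removes characters, so a changed value is always shorter.
UPDATE  n
SET     n.Notes = r.MyResult
FROM    #Notes n
CROSS APPLY dbo.RemoveNonASCII(n.Notes) r
WHERE   DATALENGTH(n.Notes) <> DATALENGTH(r.MyResult);
```

Comparing DATALENGTH rather than the strings themselves is a deliberate choice in this sketch: depending on the collation, some characters are ignored in string comparisons, so `Notes <> MyResult` could miss rows that did change.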

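I would also mention that your string splitter is a potential performance problem too. Here is a great article with a number of fast, set-based string splitters: http://sqlperformance.com/2012/07/t-sql-queries/split-strings

The last step here would be to eliminate the first loop in your procedure. That can be done as well, but I'm not entirely sure what your code is doing there. I'll take a closer look and see what I can come up with. In the meantime, parse through this and feel free to ask questions about any part you don't understand.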
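On newer versions there is also a built-in alternative to a hand-written splitter. The sketch below assumes SQL Server 2016 or later with the database at compatibility level 130 or higher, which the question does not confirm; the column list is made up for illustration.

```sql
-- Assumption: SQL Server 2016+ and compatibility level >= 130.
DECLARE @FieldList varchar(250) = 'Notes,Notes2, Notes3';

SELECT  LTRIM(RTRIM(s.value)) AS ColumnName    -- trim stray spaces around names
FROM    STRING_SPLIT(@FieldList, ',') AS s
WHERE   LTRIM(RTRIM(s.value)) <> '';           -- skip empty entries
```

STRING_SPLIT does not guarantee the order of the returned rows, which does not matter when each column is simply processed once.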

Answer 1: (score: 0)

Here is what I got working with Sean Lange's help:

How I call the stored procedure:

exec spRemoveUNICODE @FieldList='Notes,Notes2,Notes3,Notes4,Notes5',@Multiple=1,@TableName='#Temp'

The #Temp table is created like this:

create table #Temp (ID int,Notes nvarchar(Max),Notes2 nvarchar(max),Notes3 nvarchar(max),Notes4 nvarchar(max),Notes5 nvarchar(max))

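I then populated it with notes from 5 fields in a couple of different tables, ranging in length from NULL, to blank (but not NULL), to 5000 characters.

Then I inserted some random characters like this: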

update #Temp
set Notes2 = SUBSTRING(Notes2,1,LEN(Notes2)/2) + N'㹊潮Ņ᯸ࢹᖈư㹨ƶ槹鎤⻄ƺ綐ڌ⸀ƺ삸)䀤ƍ샄)Ņᛡ鎤ꗘᖃᒨ쬵Ğᘍ鎤ᐜᏰ>֔υ赸Ƹ쳰డ촜)鉀௿촜)쮜)Ἡ屰山舰霡ࣆ 耏Аం畠Ư놐ᓜતᏛ֔Ꮫ֨Ꮫ꯼ᓜƒ 邰఍厰ఆ邰఍드)抉鎤듄)繟Ĺ띨)᯸ࢹ䮸ࣉ᯸ࢹ䮸ࣉ샰)ԌƏŅ֐ᕄ홑Ņᛙ鎤ꗘᖃᒨ᯸ࢹ' + SUBSTRING(Notes2,LEN(Notes2)/2-1,LEN(Notes2)/2)

I did this for each of the 5 columns.

Here is what spRemoveUNICODE looks like now:

ALTER PROCEDURE [dbo].[spRemoveUNICODE] 
    -- Parameters
    @FieldList varchar(250) = '', 
    @Multiple int = 0,
    @TableName varchar(100) = ''
AS
BEGIN
    SET NOCOUNT ON;
    -- Variables
    declare @SQL varchar(max)
    -- Insert statements for procedure here
    if @Multiple > 0
    BEGIN
        declare @Field varchar(100)
        select Item into #TempValue from dbo.SplitStrings_Numbers(@FieldList,',')
        while (select count(*) from #TempValue) >= 1
        BEGIN
            declare @Column varchar(100) = (select top 1 Item from #TempValue)
            set @SQL = 'UPDATE ' + @TableName + ' SET ' + @Column + ' = tt.Result
                        from ' + @TableName + ' t
                        join (select ID,(select REPLACE(REPLACE(REPLACE(REPLACE(MyResult,''))'',''''),''>)'',''''),'' N>)   N'',''''),'' N   N'','''') 
                                from dbo.RemoveNonASCII_New(' + @Column + ')) Result from ' + @TableName + ') tt on t.ID = tt.ID'

            exec (@SQL)
            --print @SQL --for trouble shooting

            print @column + ' was checked.'
            delete from #TempValue
            where Item = @Column
        END
    END
    else if @Multiple = 0
    BEGIN
        set @SQL = 'UPDATE ' + @TableName + ' SET ' + @FieldList + ' = tt.Result
                        from ' + @TableName + ' t
                        join (select ID,(select REPLACE(REPLACE(REPLACE(REPLACE(MyResult,''))'',''''),''>)'',''''),'' N>)   N'',''''),'' N   N'','''') 
                                from dbo.RemoveNonASCII_New(' + @FieldList + ')) Result from ' + @TableName + ') tt on t.ID = tt.ID'

        exec (@SQL)
        --print @SQL --for trouble shooting

        -- @Column is never assigned in this branch, so report @FieldList instead
        print @FieldList + ' was checked.'
    END
END

Here is the new SplitStrings_Numbers function, which splits the column list into individual column names:

ALTER FUNCTION [dbo].[SplitStrings_Numbers]
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN
   (
       SELECT Item = SUBSTRING(@List, Number, 
         CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
       FROM dbo.Numbers
       WHERE Number <= CONVERT(INT, LEN(@List))
         AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
   );

I created the Numbers table like this:

DECLARE @UpperLimit INT = 1000000;

WITH n AS
(
    SELECT
        x = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM       sys.all_objects AS s1
    CROSS JOIN sys.all_objects AS s2
    CROSS JOIN sys.all_objects AS s3
)
SELECT Number = x
  INTO dbo.Numbers
  FROM n
  WHERE x BETWEEN 1 AND @UpperLimit;

GO
CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers(Number) 
    WITH (DATA_COMPRESSION = PAGE);
GO

And finally, the Notes are searched and the invalid UNICODE characters removed using the RemoveNonASCII_New function:

ALTER function [dbo].[RemoveNonASCII_New]
(
    @SearchVal nvarchar(max)
) returns table as 
    RETURN
    with MyValues as
    (
        select substring(@SearchVal, Number, 1) as MyChar
            , t.Number
        from Numbers t 
        where Number <= len(@SearchVal)
            and UNICODE(substring(@SearchVal, Number, 1)) between 32 and 127 
    )

    select distinct MyResult = STUFF((select MyChar + ''
                    from MyValues mv2
                    order by mv2.Number
                    FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'), 1, 0, '')
        from MyValues mv;

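The way I was doing it in the original question took over 60 minutes to clean all 5 columns. With this new approach, cleaning the same 5 columns takes 1.5 minutes. Each column has more than 11,000 rows with invalid characters added.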
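The procedure above still only reports that a column was checked, not how many rows changed, which was the original goal. One possible way to get that number is sketched below. It is not part of the accepted solution. It assumes the #Temp table and dbo.RemoveNonASCII_New from this answer exist, uses a single hypothetical column name, and leaves out the extra REPLACE calls for brevity.

```sql
-- Hypothetical inputs; inside the procedure these would be @TableName and @Column.
DECLARE @TableName sysname = N'#Temp',
        @Column    sysname = N'Notes',
        @SQL       nvarchar(max),
        @Rows      int;

-- QUOTENAME guards the identifiers concatenated into the statement.
-- Assumption: @TableName is an unqualified name (no schema prefix).
SET @SQL = N'UPDATE t
             SET ' + QUOTENAME(@Column) + N' = r.MyResult
             FROM ' + QUOTENAME(@TableName) + N' t
             CROSS APPLY dbo.RemoveNonASCII_New(t.' + QUOTENAME(@Column) + N') r
             WHERE DATALENGTH(t.' + QUOTENAME(@Column) + N') <> DATALENGTH(r.MyResult);
             SET @Rows = @@ROWCOUNT;';

-- sp_executesql can hand the row count back through an OUTPUT parameter.
EXEC sys.sp_executesql @SQL, N'@Rows int OUTPUT', @Rows = @Rows OUTPUT;

PRINT @Column + N' had ' + CAST(@Rows AS nvarchar(10)) + N' rows updated.';
```

The WHERE clause restricts the UPDATE to rows whose cleaned value is shorter than the original, so @@ROWCOUNT reflects only the rows that really contained invalid characters. One limitation of this sketch: a row made up entirely of invalid characters produces no row from the function, so CROSS APPLY skips it and it is neither updated nor counted.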