SQL Server: trying to keep non-printable characters in a string-formatting function

Date: 2018-10-20 23:06:20

Tags: sql-server regex

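I'm using SQL Server 2008 R2 with SSMS version 17.6. I'm trying to write a function that takes a string plus a "keep" command. The command is used to build a regular expression specifying which groups of characters to keep in the string, so that everything else is removed. Here is my code so far: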

ALTER FUNCTION [dbo].[fn_KeepChars]
    (@String NVARCHAR(MAX),  
     @Keep VARCHAR(100))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE 
        @MatchExpression NVARCHAR(200),
        @Alpha NVARCHAR(1000) = 'a-zA-Z',
        @Numeric NVARCHAR(1000) = '0-9',            
        @PrintSpecial NVARCHAR(1000) = '!-/' --+ CHAR(33) + '-' + CHAR(47) 
                                       + ':-@' --+ CHAR(58) + '-' + CHAR(64) 
                                       + '[-`' -- + CHAR(91) + '-' + CHAR(96)
                                       + '{-~' --+ CHAR(123) + '-' + CHAR(126)
                                       + 'Ç-■', --+ CHAR(128) + '-' + CHAR(255)
        -- @NonPrintSpecial NVARCHAR(100) = CHAR(1)+'-'+CHAR(31),
        @NonPrintSpecial NVARCHAR(100) = 'CHAR(1)-CHAR(31)',
        @AllSpecial NVARCHAR(100) 

    SET @AllSpecial = @NonPrintSpecial + @PrintSpecial

    SELECT 
        @MatchExpression = CASE LOWER(@Keep)
                              WHEN 'alpha' THEN @Alpha
                              WHEN 'num' THEN @Numeric
                              WHEN 'allspec' THEN @AllSpecial
                              WHEN 'printspec' THEN @PrintSpecial
                              WHEN 'nonprintspec' THEN @NonPrintSpecial
                              WHEN 'alphanum' THEN @Alpha + @Numeric
                              WHEN 'alphaallspec' THEN @Alpha + @AllSpecial
                              WHEN 'alphaprintspec' THEN @Alpha + @PrintSpecial
                              WHEN 'alphanonprintspec' THEN @Alpha + @NonPrintSpecial
                              WHEN 'numallspec' THEN @Numeric + @AllSpecial
                              WHEN 'numprintspec' THEN @Numeric + @PrintSpecial
                              WHEN 'numnonprintspec' THEN @Numeric + @NonPrintSpecial
                              WHEN 'alphanumprintspec' THEN @Alpha + @Numeric + @PrintSpecial
                              WHEN 'alphanumnonprintspec' THEN @Alpha + @Numeric + @NonPrintSpecial
                              ELSE 'INVALID_KEEP_PARAMETER_PASSED'
                           END 

    IF CHARINDEX('INVALID_KEEP_PARAMETER_PASSED',@MatchExpression) > 0
        RETURN 'INVALID_KEEP_PARAMETER_PASSED'

    SET @MatchExpression = '%[^' + @MatchExpression + ']%'

    WHILE PATINDEX(@MatchExpression, @String) > 0
        SET @String = STUFF(@String, PATINDEX(@MatchExpression, @String), 1, '')

    RETURN CONVERT(NVARCHAR,GETDATE()) + '      ' + @String
END

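The only thing that doesn't work is the expression generated to match non-printable characters, which is stored in the @NonPrintSpecial variable. I can think of only two ways to define it: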

@NonPrintSpecial NVARCHAR(100) = CHAR(1)+'-'+CHAR(31)

@NonPrintSpecial NVARCHAR(100) = 'CHAR(1)-CHAR(31)'

This is what I'm using to test the function:

DECLARE @String NVARCHAR(MAX),
        @String2 NVARCHAR(MAX),
        @MatchExpression NVARCHAR(MAX)
BEGIN
    SET @String = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789!@#$%^&*()_+=-`/*-+.\|]}[{/?''"' + CHAR(10)
    SELECT @String AS 'Original string'
    SELECT 'POS OF CHAR(10) in @String= ' + CONVERT(NVARCHAR,CHARINDEX(CHAR(10),@String)) AS 'Char(10) Position in original string'

    SELECT @String2 = dbo.fn_KeepChars(@String,'allspec') 
    SELECT @String2 AS 'Modified string'
    SELECT 'POS OF CHAR(10) in modified string= ' + CONVERT(NVARCHAR,CHARINDEX(CHAR(10),@String2)) AS 'Char(10) Position in modified string'
END

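If I use the definition @NonPrintSpecial NVARCHAR(100) = CHAR(1)+'-'+CHAR(31) in the function and run the test, the output shows that only the printable special characters are kept (good), but the CHAR(10) that I wanted to keep is removed.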

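If I use the definition @NonPrintSpecial NVARCHAR(100) = 'CHAR(1)-CHAR(31)' in the function and run the test, the output shows that CHAR(10) is still removed, and in addition to the special characters that were kept before, these characters

abchrABCHR123456789

are also in the modified string.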

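Can someone help me work out what I need to specify as the definition of the @NonPrintSpecial variable so that my function correctly keeps non-printable characters such as CHAR(10) in the @String passed in?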

1 Answer:

Answer 0 (score: 0)

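I figured out the answer to my question. Thanks to @AaronBertrand for putting me on the right track. The problem was that PATINDEX(), which my function relies on, did not recognize the non-printable range I was defining and adding to the regular expression the function builds. After a lot of trial and error, I tried adding each non-printable ASCII character (CHAR(1) through CHAR(31)) to the @NonPrintSpecial variable individually, and it worked. I admit it doesn't look pretty, but it works. I'd bet that someone with more regular-expression experience than I have could offer a more elegant answer, but this is what I came up with: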

ALTER FUNCTION [dbo].[fn_KeepChars]
(@String NVARCHAR(MAX),  
@Keep VARCHAR(100))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE 
        @MatchExpression NVARCHAR(200),
        @Alpha NVARCHAR(1000) = 'a-zA-Z',
        @Numeric NVARCHAR(1000) = '0-9',            
        @PrintSpecial NVARCHAR(1000) = '!-/' --+ CHAR(33) + '-' + CHAR(47) 
                                       + ':-@' --+ CHAR(58) + '-' + CHAR(64) 
                                       + '[-`' -- + CHAR(91) + '-' + CHAR(96)
                                       + '{-~' --+ CHAR(123) + '-' + CHAR(126)
                                       + 'Ç-■', --+ CHAR(128) + '-' + CHAR(255)
        @NonPrintSpecial NVARCHAR(100) =  CHAR(1) + CHAR(2) + CHAR(3) + CHAR(4) + 
                                          CHAR(5) + CHAR(6) + CHAR(7) + CHAR(8) + 
                                          CHAR(9) + CHAR(10) + CHAR(11) + CHAR(12) + 
                                          CHAR(13) + CHAR(14) + CHAR(15) + CHAR(16) + 
                                          CHAR(17) + CHAR(18) + CHAR(19) + CHAR(20) + 
                                          CHAR(21) + CHAR(22) + CHAR(23) + CHAR(24) + 
                                          CHAR(25) + CHAR(26) + CHAR(27) + CHAR(28) + 
                                          CHAR(29) + CHAR(30) + CHAR(31),
        @AllSpecial NVARCHAR(100) 

    SET @AllSpecial = @PrintSpecial + @NonPrintSpecial

    SELECT 
        @MatchExpression = CASE LOWER(@Keep)
                              WHEN 'alpha' THEN @Alpha
                              WHEN 'num' THEN @Numeric
                              WHEN 'allspec' THEN @AllSpecial
                              WHEN 'printspec' THEN @PrintSpecial
                              WHEN 'nonprintspec' THEN @NonPrintSpecial
                              WHEN 'alphanum' THEN @Alpha + @Numeric
                              WHEN 'alphaallspec' THEN @Alpha + @AllSpecial
                              WHEN 'alphaprintspec' THEN @Alpha + @PrintSpecial
                              WHEN 'alphanonprintspec' THEN @Alpha + @NonPrintSpecial
                              WHEN 'numallspec' THEN @Numeric + @AllSpecial
                              WHEN 'numprintspec' THEN @Numeric + @PrintSpecial
                              WHEN 'numnonprintspec' THEN @Numeric + @NonPrintSpecial
                              WHEN 'alphanumprintspec' THEN @Alpha + @Numeric + @PrintSpecial
                              WHEN 'alphanumnonprintspec' THEN @Alpha + @Numeric + @NonPrintSpecial
                              ELSE 'INVALID_KEEP_PARAMETER_PASSED'
                           END 

    IF CHARINDEX('INVALID_KEEP_PARAMETER_PASSED',@MatchExpression) > 0
        RETURN 'INVALID_KEEP_PARAMETER_PASSED'

    SET @MatchExpression = '%[^' + @MatchExpression + ']%'

    WHILE PATINDEX(@MatchExpression, @String) > 0
        SET @String = STUFF(@String, PATINDEX(@MatchExpression, @String), 1, '')

    RETURN @String
END
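
As a side note on why the two definitions from the question behave differently: PATINDEX takes LIKE-style wildcard patterns rather than full regular expressions, and a quoted 'CHAR(1)-CHAR(31)' is never evaluated as function calls. Inside the brackets it is just the ordinary characters C, H, A, R, (, 1, 3 and ), plus the range )-C, which under a case-insensitive collation would plausibly explain the extra letters and digits seen in the question. A small sketch of the difference follows; the variable names @Literal and @Built are made up for illustration.

```sql
-- Sketch only: compares the two candidate definitions from the question.
DECLARE @Literal NVARCHAR(100) = 'CHAR(1)-CHAR(31)';        -- 16 ordinary characters, no function calls
DECLARE @Built   NVARCHAR(100) = CHAR(1) + '-' + CHAR(31);  -- 3 characters: control char, hyphen, control char

SELECT @Literal                           AS LiteralText,
       LEN(@Literal)                      AS LiteralLength,
       LEN(@Built)                        AS BuiltLength,
       UNICODE(SUBSTRING(@Built, 1, 1))   AS FirstCodePoint,  -- code point of CHAR(1)
       UNICODE(SUBSTRING(@Built, 3, 1))   AS LastCodePoint;   -- code point of CHAR(31)
```
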

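Now I can run this test, and the output shows that the position of the line feed (CHAR(10)) in the modified string is 25 rather than 0, which means the non-printable characters I added to the string have been kept.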

DECLARE @String NVARCHAR(MAX),
        @String2 NVARCHAR(MAX)
BEGIN
    SET @String = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789!@#$%^&*()_+=-`/*-+.\|]}[{/?''"' + CHAR(10)+CHAR(13)
    SELECT @String AS 'Original string'
    SELECT 'POS OF CHAR(10) in @String= ' + CONVERT(NVARCHAR,CHARINDEX(CHAR(10),@String)) AS 'Char(10) Position in original string'

    SELECT @String2 = dbo.fn_KeepChars(@String,'allspec') 
    SELECT @String2 AS 'Modified string'
    SELECT 'POS OF CHAR(10) in modified string= ' + CONVERT(NVARCHAR,CHARINDEX(CHAR(10),@String2)) AS 'Char(10) Position in modified string'
END
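
A possibly more compact alternative, offered as a sketch under assumptions rather than a verified replacement for the function above: character ranges in PATINDEX patterns follow the sort order of the collation in effect, so forcing a binary collation should make CHAR(1)-CHAR(31) behave as a plain code-point range. The collation name Latin1_General_BIN and the shortened keep-list below are assumptions chosen for illustration. Note that a binary collation also makes a-z and A-Z case-sensitive and may change what a range such as Ç-■ covers.

```sql
-- Sketch: keep only '!' through '/' and the control characters 1 through 31.
DECLARE @String  NVARCHAR(MAX) = N'abc123!?' + NCHAR(10) + NCHAR(13);
DECLARE @Pattern NVARCHAR(200) = N'%[^!-/' + NCHAR(1) + N'-' + NCHAR(31) + N']%';

-- Under a binary collation the bracketed ranges are compared by code point.
WHILE PATINDEX(@Pattern COLLATE Latin1_General_BIN, @String COLLATE Latin1_General_BIN) > 0
    SET @String = STUFF(@String,
                        PATINDEX(@Pattern COLLATE Latin1_General_BIN, @String COLLATE Latin1_General_BIN),
                        1, N'');

-- A non-zero position means the line feed survived the clean-up.
SELECT CHARINDEX(NCHAR(10), @String) AS LineFeedPosition;
```

If this approach holds up in your environment, the same COLLATE clause could be applied to the two PATINDEX calls inside fn_KeepChars, with the range built as CHAR(1) + '-' + CHAR(31) instead of listing all 31 characters.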