SQL替换函数里面的正则表达式模式?

时间:2014-01-27 10:16:56

标签: sql-server regex

SELECT REPLACE('<strong>100</strong><b>.00 GB', '%^(^-?\d*\.{0,1}\d+$)%', '');

我想用上面的正则表达式替换数字的两个部分之间的任何标记,但它似乎不起作用。我不确定它是否是正则表达式语法是错误的,因为我尝试了更简单的语法,如'%[^0-9]%'只是为了测试,但它也没有用。有谁知道我怎么能实现这个目标?

12 个答案:

答案 0 :(得分:52)

您可以使用PATINDEX 找到模式(字符串)出现的第一个索引。然后使用STUFF将另一个字符串填充到匹配的模式(字符串)中。

遍历每一行。用你想要的东西替换每个非法字符。在您的情况下,将非数字替换为空白。内部循环是指当前单元格中有多个非法字符。

DECLARE @counter int

SET @counter = 0

WHILE(@counter < (SELECT MAX(ID_COLUMN) FROM Table))
BEGIN  

    WHILE 1 = 1
    BEGIN
        DECLARE @RetVal varchar(50)

        SET @RetVal =  (SELECT Column = STUFF(Column, PATINDEX('%[^0-9.]%', Column),1, '')
        FROM Table
        WHERE ID_COLUMN = @counter)

        IF(@RetVal IS NOT NULL)       
          UPDATE Table SET
          Column = @RetVal
          WHERE ID_COLUMN = @counter
        ELSE
            break
    END

    SET @counter = @counter + 1
END

警告:虽然这很慢!拥有varchar列可能会产生影响。所以使用LTRIM RTRIM可能会有所帮助。无论如何,它很慢。

积分转到this StackOverFlow回答。

EDIT 还可以转到@srutzky

编辑(由@Tmdean撰写) 这个答案可以适应更基于集合的解决方案,而不是一次一行。它仍然迭代一行中非数字字符数的最大值,因此它并不理想,但我认为在大多数情况下它应该是可接受的。

WHILE 1 = 1 BEGIN
    WITH q AS
        (SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n
        FROM Table)
    UPDATE Table
    SET Column = STUFF(Column, q.n, 1, '')
    FROM q
    WHERE Table.ID_Column = q.ID_Column AND q.n != 0;

    IF @@ROWCOUNT = 0 BREAK;
END;

如果您在表中维护一个表示该字段是否已被清除的列,您还可以提高效率。 (在我的示例中,NULL表示“未知”,应该是列默认值。)

DECLARE @done bit = 0;
WHILE @done = 0 BEGIN
    WITH q AS
        (SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n
        FROM Table
        WHERE COALESCE(Scrubbed_Column, 0) = 0)
    UPDATE Table
    SET Column = STUFF(Column, q.n, 1, ''),
        Scrubbed_Column = 0
    FROM q
    WHERE Table.ID_Column = q.ID_Column AND q.n != 0;

    IF @@ROWCOUNT = 0 SET @done = 1;

    -- if Scrubbed_Column is still NULL, then the PATINDEX
    -- must have given 0
    UPDATE table
    SET Scrubbed_Column = CASE
        WHEN Scrubbed_Column IS NULL THEN 1
        ELSE NULLIF(Scrubbed_Column, 0)
    END;
END;

如果您不想更改架构,则很容易适应将中间结果存储在表值变量中,该变量将在最后应用于实际表。

答案 1 :(得分:23)

一般而言,SQL Server不支持正则表达式,您不能在本机T-SQL代码中使用它们。

您可以编写CLR函数来执行此操作。例如,请参阅here

答案 2 :(得分:17)

使用Replace(Column, BadFoundCharacter, '')而不是通过其唯一位置剥离找到的角色可能会快得多。此外,不是仅替换每列中的下一个坏字符,而是替换所有找到的字符。

WHILE 1 = 1 BEGIN
    UPDATE dbo.YourTable
    SET Column = Replace(Column, Substring(Column, PatIndex('%[^0-9.-]%', Column), 1), '')
    WHERE Column LIKE '%[^0-9.-]%'
    If @@RowCount = 0 BREAK;
END;

我确信这比接受的答案更有效,只是因为它的操作更少。还有其他方法也可能更快,但我现在没时间探索这些方法。

答案 3 :(得分:3)

这是我根据以前的答案写的一个函数来完成这个。

CREATE FUNCTION dbo.RepetitiveReplace
(
    @P_String VARCHAR(MAX),
    @P_Pattern VARCHAR(MAX),
    @P_ReplaceString VARCHAR(MAX),
    @P_ReplaceLength INT = 1
)
RETURNS VARCHAR(MAX)
BEGIN
    DECLARE @Index INT;

    -- Get starting point of pattern
    SET @Index = PATINDEX(@P_Pattern, @P_String);

    while @Index > 0
    begin
        --replace matching charactger at index
        SET @P_String = STUFF(@P_String, PATINDEX(@P_Pattern, @P_String), @P_ReplaceLength, @P_ReplaceString);
        SET @Index = PATINDEX(@P_Pattern, @P_String);
    end

    RETURN @P_String;
END;

Gist

编辑:

最初我在这里有一个递归函数,它不能很好地与sql server一起使用,因为它有32个嵌套级别限制,当你尝试用函数进行32次替换时会导致类似下面的错误。而不是尝试进行服务器级别更改以允许更多嵌套(这可能是危险的,如允许永不结束循环)切换到while循环更有意义。

超出最大存储过程,函数,触发器或视图嵌套级别(限制32)。

答案 4 :(得分:3)

我偶然发现了这篇文章寻找其他的东西,但我想我会提到一个更有效的解决方案 - 当与基于集合的查询一起使用时,它应该是任何函数的默认实现 - 这是使用交叉应用表函数。似乎该主题仍然有效,所以希望这对某人有用。

基于运行基于递归集的查询或标量函数的一些答案的示例运行时,基于1m行测试集从随机newid中删除字符,WHILE循环示例的范围从34s到2m05s函数示例的1m3s到{forever}。

使用具有交叉应用的表格功能可在 10s 中实现相同的目标。您可能需要调整它以满足您的需要,例如它处理的最大长度。

功能:

CREATE FUNCTION [dbo].[RemoveChars](@InputUnit VARCHAR(40))
RETURNS TABLE
AS
RETURN
    (
        WITH Numbers_prep(Number) AS
            (
                SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
            )
        ,Numbers(Number) AS
            (
                SELECT TOP (ISNULL(LEN(@InputUnit),0))
                    row_number() OVER (ORDER BY (SELECT NULL))
                FROM Numbers_prep a
                    CROSS JOIN Numbers_prep b
            )
        SELECT
            OutputUnit
        FROM
            (
                SELECT
                    substring(@InputUnit,Number,1)
                FROM  Numbers
                WHERE substring(@InputUnit,Number,1) like '%[0-9]%'
                ORDER BY Number
                FOR XML PATH('')
            ) Sub(OutputUnit)
    )

用法:

UPDATE t
SET column = o.OutputUnit
FROM ##t t
CROSS APPLY [dbo].[RemoveChars](t.column) o

答案 5 :(得分:2)

如果要重用它,将解决方案包装在SQL函数中可能很有用。 我甚至在细胞水平上做这个,这就是为什么我把它作为一个不同的答案:

SELECT
  u.user_id as user_id,
  date(u.created) as signup_date,
  cal.date as date,
from (select date(dt) as date from [dw.calendar] where date(dt) < 
CURRENT_DATE() ) cal
  cross join each dw.user u
where
  date(u.created) <= cal.date

答案 6 :(得分:2)

对于那些寻求高性能,简单解决方案并愿意启用CLR的人:

create database TestSQLFunctions
go
use TestSQLFunctions
go
alter database TestSQLFunctions set trustworthy on

EXEC sp_configure 'clr enabled', 1
RECONFIGURE WITH OVERRIDE
go

CREATE ASSEMBLY [SQLFunctions]
AUTHORIZATION [dbo]
FROM 0x4D5A90000300000004000000FFFF0000B800000000000000400000000000000000000000000000000000000000000000000000000000000000000000800000000E1FBA0E00B409CD21B8014CCD21546869732070726F6772616D2063616E6E6F742062652072756E20696E20444F53206D6F64652E0D0D0A2400000000000000504500004C0103004BE8B85F0000000000000000E00022200B013000000800000006000000000000C2270000002000000040000000000010002000000002000004000000000000000600000000000000008000000002000000000000030060850000100000100000000010000010000000000000100000000000000000000000702700004F000000004000009803000000000000000000000000000000000000006000000C00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000080000000000000000000000082000004800000000000000000000002E74657874000000C8070000002000000008000000020000000000000000000000000000200000602E72737263000000980300000040000000040000000A0000000000000000000000000000400000402E72656C6F6300000C0000000060000000020000000E00000000000000000000000000004000004200000000000000000000000000000000A4270000000000004800000002000500682000000807000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003A022D02022A020304281000000A2A1E02281100000A2A0042534A4201000100000000000C00000076342E302E33303331390000000005006C00000018020000237E000084020000CC02000023537472696E6773000000005005000004000000235553005405000010000000234755494400000064050000A401000023426C6F620000000000000002000001471500000900000000FA0133001600000100000013000000020000000200000003000000110000000F00000001000000030000000000CA01010000000000060025014F02060092014F02060044001D020F006F02000006006C00E20106000801E2010600D400E20106007901E20106004501E20106005E01E20106008300E2010600580030020600360030020600B700E20106009E00B0010600AA02DB010A00F300FC010A001F00FC010E00C3027E02000000000100000000000100010001001000C3029D0241000100010050200000000096002E001A0001005F2000000000861817020600040000000100BD0200000200F40100000300B102090017020100110017020600190017020A0029001702100031001702100039001702100041001702100049001702100051001702100059001702100061001702150069001702100071001702100079001702100089001702060099002E001A0081001702060020007B0010012E000B002A002E00130033002E001B0052002E0023005B002E002B006D002E0033006D002E003B006D002E0043005B002E004B0073002E0053006D002E005B006D002E0063008B002E006B00B5002E007300C2000480000001000000000000000000000000009D020000040000000000000000000000210016000000000004000000000000000000000021000A00000000000400000000000000000000002100DB010000000000000000003C4D6F64756C653E0053797374656D2E44617461006D73636F726C696200446174614163636573734B696E64005265706C61636500477569644174747269627574650044656275676761626C6541747472696275746500436F6D56697369626C6541747472696275746500417373656D626C795469746C6541747472696275746500417373656D626C7954726164656D61726B417474726962757465005461726765744672616D65776F726B41747472696275746500417373656D626C7946696C6556657273696F6E41747472696275746500417373656D626C79436F6E66696775726174696F6E4174747269627574650053716C46756E6374696F6E41747472696275746500417373656D626C794465736372697074696F6E41747472696275746500436F6D70696C6174696F6E52656C61786174696F6E7341747472696275746500417373656D626C7950726F6475637441747472696275746500417373656D626C79436F7079726967687441747472696275746500417373656D626C79436F6D70616E794174747269627574650052756E74696D65436F6D7061746962696C6974794174747269627574650053797374656D2E52756E74696D652E56657273696F6E696E670053514C46756E6374696F6E732E646C6C0053797374656D0053797374656D2E5265666C656374696F6E007061747465726E004D6963726F736F66742E53716C5365727665722E536572766572002E63746F720053797374656D2E446961676E6F73746963730053797374656D2E52756E74696D652E496E7465726F7053657276696365730053797374656D2E52756E74696D652E436F6D70696C6572536572766963657300446562756767696E674D6F6465730053797374656D2E546578742E526567756C617245787072657373696F6E730053514C46756E6374696F6E73004F626A656374007265706C6163656D656E7400696E70757400526567657800000000000000003A1617E607071B47B964858BCD87458B00042001010803200001052001011111042001010E04200101020600030E0E0E0E08B77A5C561934E0890801000800000000001E01000100540216577261704E6F6E457863657074696F6E5468726F7773010801000200000000001101000C53514C46756E6374696F6E73000005010000000017010012436F7079726967687420C2A920203230323000002901002434346436386231632D393735312D343938612D396665352D32316666333934303738303900000C010007312E302E302E3000004D01001C2E4E45544672616D65776F726B2C56657273696F6E3D76342E352E320100540E144672616D65776F726B446973706C61794E616D65142E4E4554204672616D65776F726B20342E352E32808F010001005455794D6963726F736F66742E53716C5365727665722E5365727665722E446174614163636573734B696E642C2053797374656D2E446174612C2056657273696F6E3D342E302E302E302C2043756C747572653D6E65757472616C2C205075626C69634B6579546F6B656E3D623737613563353631393334653038390A4461746141636365737301000000000000982700000000000000000000B2270000002000000000000000000000000000000000000000000000A4270000000000000000000000005F436F72446C6C4D61696E006D73636F7265652E646C6C0000000000FF25002000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001001000000018000080000000000000000000000000000001000100000030000080000000000000000000000000000001000000000048000000584000003C03000000000000000000003C0334000000560053005F00560045005200530049004F004E005F0049004E0046004F0000000000BD04EFFE00000100000001000000000000000100000000003F000000000000000400000002000000000000000000000000000000440000000100560061007200460069006C00650049006E0066006F00000000002400040000005400720061006E0073006C006100740069006F006E00000000000000B0049C020000010053007400720069006E006700460069006C00650049006E0066006F0000007802000001003000300030003000300034006200300000001A000100010043006F006D006D0065006E007400730000000000000022000100010043006F006D00700061006E0079004E0061006D006500000000000000000042000D000100460069006C0065004400650073006300720069007000740069006F006E0000000000530051004C00460075006E006300740069006F006E00730000000000300008000100460069006C006500560065007200730069006F006E000000000031002E0030002E0030002E003000000042001100010049006E007400650072006E0061006C004E0061006D0065000000530051004C00460075006E006300740069006F006E0073002E0064006C006C00000000004800120001004C006500670061006C0043006F007000790072006900670068007400000043006F0070007900720069006700680074002000A90020002000320030003200300000002A00010001004C006500670061006C00540072006100640065006D00610072006B00730000000000000000004A00110001004F0072006900670069006E0061006C00460069006C0065006E0061006D0065000000530051004C00460075006E006300740069006F006E0073002E0064006C006C00000000003A000D000100500072006F0064007500630074004E0061006D00650000000000530051004C00460075006E006300740069006F006E00730000000000340008000100500072006F006400750063007400560065007200730069006F006E00000031002E0030002E0030002E003000000038000800010041007300730065006D0062006C0079002000560065007200730069006F006E00000031002E0030002E0030002E0030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002000000C000000C43700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
WITH PERMISSION_SET = SAFE

go

CREATE FUNCTION RegexReplace(
    @input nvarchar(max),
    @pattern nvarchar(max),
    @replacement nvarchar(max)
) RETURNS nvarchar  (max)
AS EXTERNAL NAME SQLFunctions.[SQLFunctions.Regex].Replace; 

go

-- outputs This is a test 
select dbo.RegexReplace('This is a test 12345','[0-9]','')

DLL的内容: enter image description here

答案 7 :(得分:2)

我认为这个解决方案更快更简单。我总是使用 CTE/递归,因为 mssql 上的速度太慢了。 我在我处理的项目和大型数据库中使用它。

/*
Function:           dbo.kSql_ReplaceRegExp
Create Date:        20.02.2021
Author:             Karcan Ozbal

Description:        The given string value will be replaced according to the given regexp/pattern.

Parameter(s):       @Value       : Value/Text to REPLACE.
                    @RegExp      : The regexp/pattern to be used for REPLACE operation.

Usage:              select dbo.kSql_ReplaceRegExp('2T3EST5','%[0-9]%')
Output:             'TEST'
*/
ALTER FUNCTION [dbo].[kSql_ReplaceRegExp](
    @Value nvarchar(max),
    @RegExp nvarchar(50)
)
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @Result nvarchar(max)

    ;WITH CTE AS (
        SELECT NUM = 1, VALUE = @Value, IDX = PATINDEX(@RegExp, @Value)
        UNION ALL
        SELECT NUM + 1, VALUE = REPLACE(VALUE, SUBSTRING(VALUE,IDX,1),''), IDX = PATINDEX(@RegExp, REPLACE(VALUE, SUBSTRING(VALUE,IDX,1),'')) 
        FROM CTE
        WHERE IDX > 0
    )
    SELECT TOP(1) @Result = VALUE 
    FROM CTE 
    ORDER BY NUM DESC
    OPTION (maxrecursion 0)

    RETURN @Result
END

答案 8 :(得分:1)

如果您只是为进入存储过程的参数执行此操作,则可以使用以下命令:

while PatIndex('%[^0-9]%', @Param) > 0
    select  @Param = Replace(@Param, Substring(@Param, PatIndex('%[^0-9]%', @Param), 1), '')

答案 9 :(得分:0)

我认为更简单,更快捷的方法是按字母表中的每个字符进行迭代:

DECLARE @i int
SET @i = 0

WHILE(@i < 256)
BEGIN  

    IF char(@i) NOT IN ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '.')      

      UPDATE Table SET Column = replace(Column, char(@i), '')

    SET @i = @i + 1

END

答案 10 :(得分:0)

我创建了此函数来清理在时间字段中包含非数字字符的字符串。他们未添加分钟数时,时间中包含问号,例如20:??。函数遍历每个字符并替换?带有0:

 CREATE FUNCTION [dbo].[CleanTime]
(
    -- Add the parameters for the function here
    @intime nvarchar(10) 
)
RETURNS nvarchar(5)
AS
BEGIN
    -- Declare the return variable here
    DECLARE @ResultVar nvarchar(5)
    DECLARE @char char(1)
    -- Add the T-SQL statements to compute the return value here
    DECLARE @i int = 1
    WHILE @i <= LEN(@intime)
    BEGIN
    SELECT @char =  CASE WHEN substring(@intime,@i,1) like '%[0-9:]%' THEN substring(@intime,@i,1) ELSE '0' END
    SELECT @ResultVar = concat(@ResultVar,@char)   
    set @i  = @i + 1       
    END;
    -- Return the result of the function
    RETURN @ResultVar

END

答案 11 :(得分:0)

我认为这更清楚:

ALTER FUNCTION [dbo].[func_ReplaceChars](
    @Value nvarchar(max),
    @Chars nvarchar(50)
)
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @cLen int = len(@Chars);
    DECLARE @curChar int = 0;

    WHILE @curChar<@cLen
    BEGIN
        set @Value = replace(@Value,substring(@Chars,@curChar,1),'');

        set @curChar = @curChar + 1;
    END;

    RETURN @Value
END