SQL Server: Replace email addresses with sample text but keep the structure

Asked: 2012-04-20 00:04:47

Tags: sql-server regex tsql

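On a SQL Server 2005 instance, what is the best way to do a find/replace on a column that holds multiple email addresses, for example

<JimmyTheBoot@yahoo.com>; JohnBlaze@TestMail.com; comfarmer@yahoo.com .....

and replace them with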

<TestMail@yRandMail.com>; TestMail@RandMail.com; TestMail@RandMail.com .....

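This is for testing purposes. I can think of a few ways to do this in C#, but I would like to know whether there is a way to do it in SQL Server itself, possibly using regex? I want to keep as much of the random weirdness as possible (some emails have brackets, some have a trailing semicolon, and so on).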

Thanks

2 Answers:

Answer 0 (score: 2)

Here is a way to do it using CTEs inside a function.

create function dbo.FixupEmails(@s varchar(8000))
returns table
as
return (
      -- Walk the string and record the position of each ';' delimiter.
      WITH splitter_cte AS (
      SELECT CHARINDEX(';', @s) as pos, 0 as lastPos, 1 as cte_level
      UNION ALL
      SELECT CHARINDEX(';', @s, pos + 1), pos, cte_level + 1 as cte_level
      FROM splitter_cte
      WHERE pos > 0
      ), each_email_cte AS(
      -- Cut out each segment between delimiters and strip '<', '>' and spaces.
      select replace(replace(replace(OneEmail, '>', ''), '<', ''), ' ', '') as OneEmail, cte_level
        from (select SUBSTRING(@s, lastPos + 1,
                         case when pos = 0 then 8000 else pos - lastPos -1 end) as OneEmail,
                         cte_level
                from splitter_cte) as t
      ), each_half_cte AS (
        -- Drop segments too short to be an email, such as the empty piece after a trailing ';'.
        select OneEmail, CHARINDEX('@', OneEmail) as atPos, cte_level
        from each_email_cte
        where len(OneEmail) >= 6  -- x@x.co has 6 characters (I think that 6 would be the minimum valid email length)
      ), new_email_cte as
      (
        -- Anchor: replace the first address in the original string.
        select cte1.OneEmail, Replace(@s, cte1.OneEmail, 'TestMail@RandMail.com') as New, cte1.cte_level
        from each_half_cte cte1
        where cte1.cte_level = 1

        UNION ALL

        -- Recursive step: replace the next address in the string produced so far.
        select cte2.OneEmail, Replace(necte.New, cte2.OneEmail, 'TestMail@RandMail.com') as New, cte2.cte_level
        from new_email_cte as necte
        inner join each_half_cte as cte2 on cte2.cte_level = necte.cte_level + 1
      )
      -- The deepest level holds the string with every address replaced.
      select New
      from new_email_cte
      where cte_level = (select max(cte_level) from new_email_cte)
)
go

set nocount on;

declare @emailString varchar(2048)
set @emailString = '<JimmyTheBoot@yahoo.com>; JohnBlaze@TestMail.com; comfarmer@yahoo.com ';
select @emailString as Original;
SELECT *
  FROM dbo.FixupEmails(@emailString);




set @emailString = '<JimmyTheBoot@yahoo.com>; JohnBlaze@TestMail.com;';
select @emailString as Original;
SELECT *
  FROM dbo.FixupEmails(@emailString);


set @emailString = '<JimmyTheBoot@yahoo.com>';
select @emailString as Original;
SELECT *
  FROM dbo.FixupEmails(@emailString)
OPTION(MAXRECURSION 0);
-- include MAXRECURSION as shown above if you have more than 100 email addresses in the field.



set @emailString = '<bill@whatever.co.uk>; John@TestMail.tv;';
select @emailString as Original;
SELECT *
  FROM dbo.FixupEmails(@emailString)

It's a bit long, but here is the output.

Original
----------------------------------------------------------------
<JimmyTheBoot@yahoo.com>; JohnBlaze@TestMail.com; comfarmer@yahoo.com 

New
-----------------------------------------------------------------
<TestMail@RandMail.com>; TestMail@RandMail.com; TestMail@RandMail.com 




Original
----------------------------------------------------------------
<JimmyTheBoot@yahoo.com>; JohnBlaze@TestMail.com;

New
----------------------------------------------------------------
<TestMail@RandMail.com>; TestMail@RandMail.com;





Original
----------------------------------------------------------------
<JimmyTheBoot@yahoo.com>

New
----------------------------------------------------------------
<TestMail@RandMail.com>





Original
----------------------------------------------------------------
<bill@whatever.co.uk>; John@TestMail.tv;

New
----------------------------------------------------------------
<TestMail@RandMail.com>; TestMail@RandMail.com;

That was fun. I think the function provided does what you need.
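
To run this against a whole column rather than a variable, something like the following sketch might work. The table dbo.Contacts and its column Emails are hypothetical names used only for illustration; substitute your own. It relies on CROSS APPLY, which SQL Server 2005 supports at compatibility level 90. Rows for which the function returns nothing (for example, a value with no usable address in its first segment) are left unchanged, and the replaced text can be longer than the original, so the column must be wide enough to hold it.

```sql
-- Hypothetical table and column names, shown only as an illustration.
UPDATE c
SET    c.Emails = f.New
FROM   dbo.Contacts AS c
CROSS APPLY dbo.FixupEmails(c.Emails) AS f
OPTION (MAXRECURSION 0);  -- only needed when a value holds more than 100 addresses
```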

Answer 1 (score: 1)

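Some suggestions:

  • SQL Server String Functions

    One approach might be:

    1. Find the index of the @ symbol.
    2. Replace the part before it (back to the previous space or other delimiter, perhaps one of the set [],;<>()) with your email ID.
    3. Replace the part after it (up to the next space or other delimiter) with your domain.
    4. Repeat for the next @ symbol in the column.

    If you happen to replace part of a friendly name rather than an email identifier by accident, it should not matter.

    Use CHARINDEX to find the index of the next @ symbol in the string. Use PATINDEX to find the index of a specific pattern, such as a space or another delimiter. It might be easier to split the string into sections, or on spaces, than to process the whole thing at once.

  • It might also be easier to write a regular expression and set up a SQL CLR function to perform the replacement.

  • If the reason for replacing the email addresses is to avoid sending email to them, you could set up a debug flag/option in the application. When the flag is set, replace the email addresses with a developer-defined address, or log the email but skip sending it.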
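
The CHARINDEX-based steps in this answer can be sketched roughly as follows. This is a minimal illustration, not a tested production routine: the variable names, the delimiter set, and the replacement address are all assumptions, and it replaces the whole token around each @ in one step instead of handling the local part and the domain separately. It uses only features available in SQL Server 2005 and assumes varchar data.

```sql
-- All names and values below are illustrative assumptions.
DECLARE @s varchar(8000);
DECLARE @replacement varchar(100);
DECLARE @delims varchar(20);
DECLARE @at int, @start int, @stop int;

SET @s = '<JimmyTheBoot@yahoo.com>; JohnBlaze@TestMail.com; comfarmer@yahoo.com ';
SET @replacement = 'TestMail@RandMail.com';
SET @delims = ' ,;<>()[]';  -- characters assumed to end an address

SET @at = CHARINDEX('@', @s);
WHILE @at > 0
BEGIN
    -- Walk left from the @ until the previous character is a delimiter.
    SET @start = @at;
    WHILE @start > 1 AND CHARINDEX(SUBSTRING(@s, @start - 1, 1), @delims) = 0
        SET @start = @start - 1;

    -- Walk right from the @ until the next character is a delimiter.
    -- DATALENGTH is used because LEN ignores trailing spaces.
    SET @stop = @at;
    WHILE @stop < DATALENGTH(@s) AND CHARINDEX(SUBSTRING(@s, @stop + 1, 1), @delims) = 0
        SET @stop = @stop + 1;

    -- Swap the whole token for the replacement address.
    SET @s = STUFF(@s, @start, @stop - @start + 1, @replacement);

    -- Continue searching just past the text that was inserted.
    SET @at = CHARINDEX('@', @s, @start + LEN(@replacement));
END

SELECT @s AS New;
```

Because only the token around each @ is touched, brackets, semicolons, and spacing stay as they were. Wrapping the loop in a scalar function would let it be applied to a column, at the cost of row-by-row execution.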