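SQL - Extract text between characters

Time: 2018-07-17 19:55:02

Tags: sql sql-server tsql substring charindex

This is my data. (I'm trying to extract the exact email addresses so that I can send emails to the TO and CC recipients.)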

    EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]             
    EmailCC:[url=mailto:Test_Email_2@Yahoo.com] Test_Email_2@Yahoo.com[/url]           

    Hello, This is the rest of the email message....

When I run the first SQL, I get the result I want.

    Select
    Body,
    SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail
    From hdIssues

This returns

    ToEmail = Test_Email_1@Yahoo.com

But when I try to add a second SUBSTRING like this

    Select
    Body,
    SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail,
    SUBSTRING(Body, CHARINDEX('EmailCC', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailCC',Body)-20) CCEmail --(Simply replacing the EmailTo from the previous line to EmailCC)
    From hdIssues   

I get this error

    "Msg 537, Level 16, State 5, Line 1 Invalid length parameter passed to the LEFT or SUBSTRING function."

Thanks for your help.

P.S. In my data set, the email address field can have multiple recipients, separated by semicolons:

[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]; [url=mailto:Test_Email_5@Yahoo.com] Test_Email_5@Yahoo.com[/url]; [url=mailto:Test_Email_8@Yahoo.com] Test_Email_8@Yahoo.com[/url]

3 answers:

Answer 0: (score: 1)

I would use regexp_substr

with t1(col) as(
   select 'EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]' from dual
)

select regexp_substr(col, '[[:alnum:]._%-]+@[[:alnum:]._%-]+\.com') as res
  from t1;

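This will pull out both email addresses, which I left in because you said in the P.S. that there may be multiple email addresses. You can modify the regular expression to extract only a single copy of each email.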
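The query above relies on regexp_substr and dual, which may not be available on the SQL Server instance the question is tagged with. For the semicolon-separated case from the P.S., one rough alternative is to split the list first and then apply the same CHARINDEX arithmetic to each piece. This is only a minimal sketch: it assumes SQL Server 2016 or later (for STRING_SPLIT), and the variable name @Recipients is hypothetical, holding the sample value from the question.

```sql
-- Hypothetical variable holding the sample recipient list from the question.
DECLARE @Recipients varchar(max) =
    '[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]; '
  + '[url=mailto:Test_Email_5@Yahoo.com] Test_Email_5@Yahoo.com[/url]; '
  + '[url=mailto:Test_Email_8@Yahoo.com] Test_Email_8@Yahoo.com[/url]';

-- Split on ';', then take the text between 'mailto:' (7 characters) and the next ']'.
SELECT Email = SUBSTRING(s.value,
                         CHARINDEX('mailto:', s.value) + 7,
                         CHARINDEX(']', s.value) - CHARINDEX('mailto:', s.value) - 7)
FROM   STRING_SPLIT(@Recipients, ';') AS s
-- Keep only pieces that actually contain 'mailto:' followed by ']'.
WHERE  CHARINDEX('mailto:', s.value) > 0
  AND  CHARINDEX(']', s.value) > CHARINDEX('mailto:', s.value);
```

Each piece yields one row with a single address. STRING_SPLIT does not guarantee the order of its output rows, so this sketch should not be relied on to preserve recipient order.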

Answer 1: (score: 1)

If you are open to a TVF (table-valued function)

Example

Select A.ID
      ,B.*
 From  YourTable A
 Cross Apply [dbo].[tvf-Str-Extract](A.Body,'[url=mailto:',']') B

Returns

ID  RetSeq  RetPos  RetVal
1   1       23      Test_Email_1@Yahoo.com
1   2       89      Test_Email_5@Yahoo.com
1   3       155     Test_Email_8@Yahoo.com
1   4       229     Test_Email_2@Yahoo.com

The TVF, if interested

CREATE FUNCTION [dbo].[tvf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
Returns Table 
As
Return (  

with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
       cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
       cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
       cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)

Select RetSeq = Row_Number() over (Order By N)
      ,RetPos = N
      ,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1) 
 From  (
        Select *,RetVal = Substring(@String, N, L) 
         From  cte4
       ) A
 Where charindex(@Delimiter2,RetVal)>1

)
/*
Max Length of String 1MM characters

Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
Select * From [dbo].[tvf-Str-Extract] (@String,'[[',']]')
*/
  

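EDIT - To get the Body

The two delimiters are '[/url]' and '|||'. We force an ending delimiter by appending a unique string to the Body; in this case I chose |||.

If you don't want multiple records, remove CROSS APPLY B.

Example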

Select A.ID
      ,B.*
      ,Body = ltrim(rtrim(C.RetVal))
 From  @YourTable A
 Cross Apply [dbo].[tvf-Str-Extract](A.Body,'[url=mailto:',']') B
 Cross Apply [dbo].[tvf-Str-Extract](A.Body+'|||','[/url]','|||') C  --- Notice A.Body+'|||'.... this is to force an ending delimiter

Returns

[Image of the result set; not available in this copy]

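Answer 2: (score: 0)

To fix your query, start the search for the closing ']' at the position of 'EmailCC'. As written, CHARINDEX(']', Body) returns the first ']' in Body, which belongs to the EmailTO entry and sits before 'EmailCC', so the computed length is negative and SUBSTRING raises the error. You can do this by passing the optional start_location argument to CHARINDEX().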
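The arithmetic behind the error can be checked directly. The following is a small diagnostic sketch, assuming a variable @Body (a hypothetical stand-in for one row of the Body column) that holds the two sample lines from the question:

```sql
-- Hypothetical stand-in for one Body value, built from the question's sample data.
DECLARE @Body varchar(max) =
    'EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]'
  + CHAR(13) + CHAR(10)
  + 'EmailCC:[url=mailto:Test_Email_2@Yahoo.com] Test_Email_2@Yahoo.com[/url]';

SELECT
    CcPos          = CHARINDEX('EmailCC', @Body),                        -- where 'EmailCC' starts
    FirstBracket   = CHARINDEX(']', @Body),                              -- first ']' in the whole string
    BracketAfterCc = CHARINDEX(']', @Body, CHARINDEX('EmailCC', @Body)), -- first ']' at or after 'EmailCC'
    -- Length as computed in the failing query: first ']' lies before 'EmailCC'.
    BadLength      = CHARINDEX(']', @Body) - CHARINDEX('EmailCC', @Body) - 20,
    -- Length when the ']' search starts at 'EmailCC'.
    GoodLength     = CHARINDEX(']', @Body, CHARINDEX('EmailCC', @Body))
                     - CHARINDEX('EmailCC', @Body) - 20;
```

FirstBracket is smaller than CcPos, so BadLength comes out negative, which is what SUBSTRING rejects. GoodLength is the length of the CC address.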

So change the query to the following:

    Select
    Body,
    SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail,
    SUBSTRING(Body, CHARINDEX('EmailCC', Body) + 20,CHARINDEX(']',Body, CHARINDEX('EmailCC', Body))-CHARINDEX('EmailCC',Body)-20) CCEmail
    From hdIssues

See the CHARINDEX documentation for details on the start_location argument.
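One caveat: if a Body contains no 'EmailCC' marker, CHARINDEX returns 0 and the same expression can return the wrong text or raise the same error. A defensive variant might compute each position once and return NULL when a marker or its closing ']' is missing. This is a sketch only; the table variable @hdIssues and its ID column are hypothetical stand-ins for the hdIssues table.

```sql
-- Hypothetical stand-in for hdIssues with two sample rows.
DECLARE @hdIssues table (ID int, Body varchar(max));
INSERT INTO @hdIssues (ID, Body) VALUES
 (1, 'EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url] '
   + 'EmailCC:[url=mailto:Test_Email_2@Yahoo.com] Test_Email_2@Yahoo.com[/url] Hello'),
 (2, 'A body with no markers at all');

SELECT i.ID,
       -- Only call SUBSTRING when the marker exists and a ']' follows the 20-character prefix.
       ToEmail = CASE WHEN toPos.Marker > 0 AND toEnd.Bracket > toPos.Marker + 20
                      THEN SUBSTRING(i.Body, toPos.Marker + 20, toEnd.Bracket - toPos.Marker - 20)
                 END,
       CCEmail = CASE WHEN ccPos.Marker > 0 AND ccEnd.Bracket > ccPos.Marker + 20
                      THEN SUBSTRING(i.Body, ccPos.Marker + 20, ccEnd.Bracket - ccPos.Marker - 20)
                 END
FROM   @hdIssues AS i
CROSS APPLY (SELECT Marker  = CHARINDEX('EmailTO', i.Body))               AS toPos
CROSS APPLY (SELECT Bracket = CHARINDEX(']', i.Body, toPos.Marker))       AS toEnd
CROSS APPLY (SELECT Marker  = CHARINDEX('EmailCC', i.Body))               AS ccPos
CROSS APPLY (SELECT Bracket = CHARINDEX(']', i.Body, ccPos.Marker))       AS ccEnd;
```

Row 1 yields both addresses. Row 2 yields NULL in both columns instead of an error.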