How to get an email ID from a string in SQL Server?

Time: 2012-07-16 10:21:48

Tags: sql sql-server sql-server-2008 tsql

DECLARE @str as varchar(500) = '</FONT><FONT SIZE=2 FACE="Arial">atul.kale@bca.net</FONT><FONT SIZE=2 FACE="Arial">'
SELECT substring(@str,patindex('%">%',@str),patindex('%</FONT>%',@str))

I am trying to extract the email ID from the @str string.

The output should be:

atul.kale@bca.net

I can't work out how to use substring for this in SQL Server.

3 answers:

Answer 0: (score: 2)

If the string "</FONT><FONT SIZE=2 FACE="Arial">" around the email address never changes, you can simply remove it:

DECLARE @str as varchar(500) = '</FONT><FONT SIZE=2 FACE="Arial">atul.kale@bca.net</FONT><FONT SIZE=2 FACE="Arial">'
SELECT REPLACE(@str,'</FONT><FONT SIZE=2 FACE="Arial">','')

If it is not constant, take the text between the first "> and the </FONT> that follows it:

DECLARE @str as varchar(500) = '</FONT><FONT SIZE=2 FACE="Arial">atul.kale@bca.net</FONT><FONT SIZE=2 FACE="Arial">'
select substring(@str,patindex('%">%',@str)+2,patindex('%</FONT>%',substring(@str,patindex('%">%',@str)+3,len(@str))))
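The nested expression above works because SUBSTRING takes a start position and a length, not an end position. A minimal sketch of the same idea with named intermediate values, assuming the email is always preceded by "> and followed by </FONT> (the variables @start and @len are illustrative names, not part of the original answer):

```sql
DECLARE @str AS varchar(500) = '</FONT><FONT SIZE=2 FACE="Arial">atul.kale@bca.net</FONT><FONT SIZE=2 FACE="Arial">'

-- Position of the first character after the first '">'
DECLARE @start int = CHARINDEX('">', @str) + 2

-- Number of characters between @start and the first '</FONT>' found at or after @start
DECLARE @len int = CHARINDEX('</FONT>', @str, @start) - @start

-- SUBSTRING(expression, start, length): the third argument is a length, not an end position
SELECT SUBSTRING(@str, @start, @len) AS email
```

For the sample string this selects atul.kale@bca.net. Passing @start as the third argument of CHARINDEX skips the leading </FONT> at the very beginning of the string.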

Answer 1: (score: 0)

DECLARE @str as varchar(500) = '</FONT><FONT SIZE=2 FACE="Arial">atul.kale@bca.net</FONT><FONT SIZE=2 FACE="Arial">' 

select substring(col,1,charindex('<',col)-1) as email from
(
    select substring(@str,charindex('">',@str)+2,len(@str)) as col
) as t
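One caveat with this pattern: CHARINDEX returns 0 when a delimiter is missing, and SUBSTRING raises an error when its length argument is negative. A hedged sketch that yields NULL for such rows instead, using NULLIF; the table variable @samples and its column src are hypothetical and exist only for illustration:

```sql
DECLARE @samples TABLE (src varchar(500))
INSERT INTO @samples (src) VALUES
    ('</FONT><FONT SIZE=2 FACE="Arial">atul.kale@bca.net</FONT><FONT SIZE=2 FACE="Arial">'),
    ('no markup here')

SELECT src,
       -- NULLIF turns a "not found" 0 into NULL, so the whole expression becomes NULL
       SUBSTRING(src,
                 s.pos + 2,
                 NULLIF(CHARINDEX('<', src, s.pos + 2), 0) - (s.pos + 2)) AS email
FROM @samples
CROSS APPLY (SELECT NULLIF(CHARINDEX('">', src), 0) AS pos) AS s
```

The first row produces the email address; the second row has no "> delimiter, so email is NULL rather than an error.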

Answer 2: (score: 0)

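This is the approach I tried:

  1. Strip the HTML tags from the string.
  2. Use Phil Factor's script from the PATINDEX Workbench to extract the email address from the string.
  3. Even if the string contains several @ symbols, the script ignores those that are not part of an email address. Also, if the string contains more than one email address, the script returns only the first one.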

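    --This script is NOT written by me. I have it on my laptop and at present I don't remember who created it :(
    --Tags are removed by scanning for '<' ... '>' pairs rather than by casting to XML,
    --because markup such as <FONT SIZE=2 ...> (unquoted attribute) is not well-formed XML and the cast fails.
    CREATE FUNCTION [dbo].[StripHTML]( @text VARCHAR(MAX) ) 
    RETURNS VARCHAR(MAX) 
    AS
    BEGIN
        DECLARE @start INT
        DECLARE @end INT
        --Locate the first '<' and the first '>' after it
        SET @start = CHARINDEX('<', @text)
        SET @end = CHARINDEX('>', @text, @start + 1)
        WHILE @start > 0 AND @end > 0
        BEGIN
            --Delete the tag, then look for the next one
            SET @text = STUFF(@text, @start, @end - @start + 1, '')
            SET @start = CHARINDEX('<', @text)
            SET @end = CHARINDEX('>', @text, @start + 1)
        END
        RETURN @text
    END
    GO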
    
    --Variable declaration & test string
    DECLARE @str AS VARCHAR(500) 
    SET @str = 'Alpha Beta <FONT SIZE=2 FACE="Arial"> i will be @ gamma</font> one two three </FONT><FONT SIZE=2 FACE="Arial">atul.kale@bca.net</FONT><FONT SIZE=2 FACE="Arial"> i hate mondays'
    
    --First, strip the HTML out of the given string.
    --The result is stored back into the same variable; you may want to use a separate one.
    SELECT @str = dbo.StripHTML(@str)
    
    --Extract the email ID from the given string.
    --Even if the @ symbol appears several times in the string, occurrences that are not part of an email address are ignored.
    --If there are multiple email IDs, this script returns the first one ONLY.
    --Source: http://www.simple-talk.com/sql/t-sql-programming/patindex-workbench/
    SELECT  
        CASE WHEN AtIndex=0 THEN '' --no email found
        ELSE RIGHT(head, PATINDEX('% %', REVERSE(head) + ' ') - 1) + LEFT(tail + ' ', PATINDEX('% %', tail + ' ') - 1) --"- 1" drops the trailing space
        END EmailAddress
    FROM 
    (
        SELECT RIGHT(EmbeddedEmail, [len] - AtIndex) AS tail,
                 LEFT(EmbeddedEmail, AtIndex) AS head, AtIndex
        FROM 
        (
            SELECT 
                PATINDEX('%[A-Z0-9-]@[A-Z0-9-]%', EmbeddedEmail+' ') AS AtIndex,
                LEN(EmbeddedEmail+'|')-1 AS [len], EmbeddedEmail
                FROM (
                      SELECT @str 
                     ) AS ListOfCompanies (EmbeddedEmail)
        )f
    )g;
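The claim that a stray @ is ignored follows from the PATINDEX pattern, which requires an address character on both sides of the @. A small sketch of that behavior in isolation; the two sample strings are made up for illustration, and the second result assumes a case-insensitive collation (under a case-sensitive or binary collation, [A-Z] may not match lowercase letters):

```sql
SELECT
    -- '@' surrounded by spaces: no address character on either side, so no match (0)
    PATINDEX('%[A-Z0-9-]@[A-Z0-9-]%', 'i will be @ gamma')          AS lone_at,
    -- '@' between address characters: returns the position of the character just before '@'
    PATINDEX('%[A-Z0-9-]@[A-Z0-9-]%', 'mail atul.kale@bca.net now') AS real_address
```

Because PATINDEX reports only the first match, this also explains why the script picks up just the first email address in a string.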