Using PATINDEX to find varying-length patterns in T-SQL

Asked: 2012-03-19 15:04:38

Tags: sql sql-server tsql

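I'm trying to pull floats out of some varchars, using PATINDEX() to find them. I know that in each varchar string I'm only interested in the first float that appears, but the floats may have different lengths.

For example: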

'some text 456.09 other text'
'even more text 98273.453 la la la'

I would normally match these with the regular expression

  "[0-9]+[.][0-9]+"

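However, I can't find an equivalent of the + operator that PATINDEX accepts. So the two strings would have to be matched (respectively) with: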

'[0-9][0-9][0-9].[0-9][0-9]' and '[0-9][0-9][0-9][0-9][0-9].[0-9][0-9][0-9]' 

Is there any way to match both of these example varchars with a single valid PATINDEX pattern?

6 answers:

Answer 0: (score: 12)

I blogged about this a while ago: Extracting numbers with SQL server

Declare @Temp Table(Data VarChar(100))

Insert Into @Temp Values('some text 456.09 other text')
Insert Into @Temp Values('even more text 98273.453 la la la')
Insert Into @Temp Values('There are no numbers in this one')

Select Left(
             SubString(Data, PatIndex('%[0-9.-]%', Data), 8000),
             PatIndex('%[^0-9.-]%', SubString(Data, PatIndex('%[0-9.-]%', Data), 8000) + 'X')-1)
From   @Temp
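
To see why the nested expression works, it can help to unroll it for a single string. The following is only a sketch of the same technique: the variable names (@Data, @Start, @Tail, @Length) are made up for illustration, and the inline DECLARE initializers assume SQL Server 2008 or later.

```sql
Declare @Data VarChar(100) = 'even more text 98273.453 la la la'

-- Position of the first character that could belong to a number (0 if none)
Declare @Start Int = PatIndex('%[0-9.-]%', @Data)

-- Everything from that character to the end of the string
Declare @Tail VarChar(100) = SubString(@Data, @Start, 8000)

-- Position of the first character in the tail that is NOT part of the number.
-- Appending 'X' guarantees a match even when the number ends the string.
Declare @Length Int = PatIndex('%[^0-9.-]%', @Tail + 'X') - 1

Select Left(@Tail, @Length) As FirstNumber   -- 98273.453
```

Because the character class also accepts '.' and '-', a period or hyphen that appears before the first digit (as in 'some-text 4.5') is picked up instead of the number, so this approach suits input where those characters only occur inside numbers.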

Answer 1: (score: 2)

Use wildcards.

SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%','some text 456.09 other text')
SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%','even more text 98273.453 la la la')
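
One thing worth checking before relying on this pattern: % matches any run of characters, not just digits, and the pattern needs at least two digits before the decimal point. The strings below are made-up test inputs, not from the question, and are meant only as a sketch of those two edge cases.

```sql
-- Only one digit before the point: no match, so PATINDEX returns 0
SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%', 'pi is 3.14')

-- The first % is allowed to span ' costs 1', so this matches (non-zero result)
-- even though '7 costs 12.50' is not a single number
SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%', 'room 7 costs 12.50 today')
```

So the pattern can tell you that something float-like is present, but the position it returns is not guaranteed to be the start of the float, and it says nothing about where the float ends.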

Answer 2: (score: 1)

Yes, you need to link in the CLR to get regex support. But if PATINDEX doesn't do what you need, this is exactly what regex is designed for.

http://msdn.microsoft.com/en-us/magazine/cc163473.aspx

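Answer 3: (score: 1)

This should be checked for robustness (what happens if you only have an int, for example), but it is just meant to put you on the right track: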

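if exists (select routine_name from information_schema.routines where routine_name = 'GetFirstFloat')
    drop function dbo.GetFirstFloat
go

create function dbo.GetFirstFloat (@string varchar(max))
returns float
as
begin
    declare @float varchar(max)
    declare @pos int
    declare @char char(1)

    -- position of the first digit; 0 means the string contains no number
    select @pos = patindex('%[0-9]%', @string)
    if @pos = 0 return null

    select @float = ''
    select @char = substring(@string, @pos, 1)

    -- collect digits and at most one decimal point; isnumeric() is avoided
    -- here because it also accepts characters such as '$', ',', '+' and '-'
    while @char like '[0-9]' or (@char = '.' and charindex('.', @float) = 0)
    begin
        select @float = @float + @char
        select @pos = @pos + 1
        select @char = substring(@string, @pos, 1)
    end

    return cast(@float as float)
end
go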


select dbo.GetFirstFloat('this is a string containing pi 3.14159216 and another non float 3 followed by a new float 5.41 and that''s it')
select dbo.GetFirstFloat('this is a string with no float')
select dbo.GetFirstFloat('this is another string with an int 3')

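Answer 4: (score: 0)

PATINDEX is not powerful enough. You should use regular expressions.

SQL Server has supported regular expressions since SQL Server 2005, through CLR integration (user-defined functions written in .NET).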

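Answer 5: (score: 0)

Given that the pattern varies in length, you're going to have a rough time getting this to work with PATINDEX. There is another post that I wrote, which I've modified to do what you're trying to do here. Will this work for you?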

CREATE TABLE #nums (n INT)
DECLARE @i INT 
SET @i = 1
WHILE @i < 8000 
BEGIN
    INSERT #nums VALUES(@i)
    SET @i = @i + 1
END

CREATE TABLE #tmp (
  id INT IDENTITY(1,1) not null,
  words VARCHAR(MAX) null
)

INSERT INTO #tmp
VALUES('I''m looking for a number, regardless of length, even 23.258 long'),('Maybe even pi which roughly 3.14159265358,'),('or possibly something else that isn''t a number')

UPDATE #tmp SET words = REPLACE(words, ',',' ')

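-- Split each row into space-delimited words using the numbers table,
-- then keep the first word per row that ISNUMERIC accepts.
;WITH CTE AS (
    -- ordering by n as well keeps the words of a row in their original order
    SELECT ROW_NUMBER() OVER (ORDER BY ID, n) AS rownum, ID,
           NULLIF(SUBSTRING(' ' + words + ' ' , n , CHARINDEX(' ' , ' ' + words + ' ' , n) - n) , '') AS word
    FROM #nums, #tmp
    -- n (not ID) must stay inside the padded string
    WHERE n <= LEN(' ' + words + ' ') AND SUBSTRING(' ' + words + ' ' , n - 1, 1) = ' '
    AND CHARINDEX(' ' , ' ' + words + ' ' , n) - n > 0),
    ids AS (SELECT ID, MIN(rownum) AS rownum FROM CTE WHERE ISNUMERIC(word) = 1 GROUP BY id)
SELECT CTE.rownum, cte.id, cte.word
FROM CTE, ids WHERE cte.id = ids.id AND cte.rownum = ids.rownum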

The original post goes into more detail on the explanation and origin of the code.