SQL Server: select specific words matching a pattern from a string

Time: 2014-05-23 08:20:54

Tags: sql-server regex

I have a small problem:

A table like (id, title, text) contains text data such as 2132-12-42 trash trash 2130-10-21 trash trash ...

What I want to achieve is to select only the dates from the text column, that is, every substring matching:

[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]

So far I have managed to list the date without the rest of the string, but only the first date:

SELECT *,
SUBSTRING([text],NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', [text]),0),10) AS [date]
FROM [table]

So for the example above, the [date] column shows 2132-12-42 but not 2130-10-21 ...

Is there a way to select all words matching the pattern, not just the first one?

I am using SQL Server 2012.

2 answers:

Answer 0: (score: 0)

This code selects all dates matching the pattern from a given string:

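DECLARE @IND INT
DECLARE @Str NVARCHAR(MAX)

SET @Str = '2132-12-42 trash trash 2130-10-21 trash trash'

-- Position of the first match (0 if there is none)
SET @IND = PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', @Str)

WHILE(@IND > 0)
BEGIN

    -- Return the 10-character match
    SELECT SUBSTRING(@Str, @IND,  10)

    -- Keep only the text after the match (a match is exactly 10 characters long)
    SET @Str = SUBSTRING(@Str, @IND+10,  LEN(@Str))

    -- Look for the next match in the remainder
    SET @IND = PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', @Str)

END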
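
The loop above returns each match as its own result set. As a sketch only, the same loop can collect the matches into a table variable and return them together in one result set. The names @Pattern, @Found and pos are hypothetical and not part of the answer, and the remainder is taken from 10 characters past the match start, since one match is 10 characters long.

```sql
-- Sketch: collect every match into a table variable instead of
-- emitting one result set per match. @Pattern, @Found and pos are
-- illustrative names, not part of the original answer.
DECLARE @Str NVARCHAR(MAX) = N'2132-12-42 trash trash 2130-10-21 trash trash';
DECLARE @Pattern VARCHAR(100) = '%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%';
DECLARE @Found TABLE (pos INT IDENTITY(1,1), [date] NCHAR(10));
DECLARE @IND INT = PATINDEX(@Pattern, @Str);

WHILE @IND > 0
BEGIN
    -- Store the 10-character match
    INSERT @Found ([date]) VALUES (SUBSTRING(@Str, @IND, 10));

    -- Continue searching in the text that follows the match
    SET @Str = SUBSTRING(@Str, @IND + 10, LEN(@Str));
    SET @IND = PATINDEX(@Pattern, @Str);
END;

-- One row per match, in order of appearance
SELECT pos, [date] FROM @Found ORDER BY pos;
```

For the sample string this should yield two rows, 2132-12-42 followed by 2130-10-21. Note that it still works on a single variable, not on every row of a table.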

Answer 1: (score: 0)

Here is an example that uses a recursive common table expression (CTE) to get all dates from all rows:

DECLARE @table TABLE(id INT, title NVARCHAR(255), [text] NVARCHAR(MAX))
INSERT @table
        ( id, title, [text] ) VALUES  
        ( 1, 'test_1', '2132-12-22 blah blah blah some thing else 2130-10-21 something else etc 2010-05-06'),
        ( 2, 'test_2', 'blasdasdasdasda asdasdasd asdasdasd'),
        ( 3, 'test_3', 'blasdasdasdasda 2013-06-07 2015-01-01 asdasdasd asdasdasd')


;WITH cte AS
(
SELECT id, title, [text],
    SUBSTRING([text],NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', [text]),0),10) AS [date],
    SUBSTRING([text],NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', [text]),0)+10,LEN([text])) AS [rest],
    1 AS dateNum
FROM @table
WHERE PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', [text]) > 0
UNION ALL
SELECT id, title, [text], 
    SUBSTRING(rest,NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', rest),0),10) AS [date],
    SUBSTRING(rest,NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', rest),0)+10,LEN(rest)) AS [rest],
    dateNum + 1 AS dateNum
FROM cte
WHERE PATINDEX('%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%', rest) > 0
)
SELECT id, title , [date], dateNum, [text] FROM cte

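As you can see, it is almost your original query, except that it keeps track of the remainder of the string at each iteration and searches within that remainder.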

As far as I can tell, it seems to work correctly...

Result:

id  title   date    dateNum text
1   test_1  2132-12-22  1   2132-12-22 blah blah blah some thing else 2130-10-21 something else etc 2010-05-06
3   test_3  2013-06-07  1   blasdasdasdasda 2013-06-07 2015-01-01 asdasdasd asdasdasd
3   test_3  2015-01-01  2   blasdasdasdasda 2013-06-07 2015-01-01 asdasdasd asdasdasd
1   test_1  2130-10-21  2   2132-12-22 blah blah blah some thing else 2130-10-21 something else etc 2010-05-06
1   test_1  2010-05-06  3   2132-12-22 blah blah blah some thing else 2130-10-21 something else etc 2010-05-06
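
To make the "remainder" idea in this answer more concrete, here is a minimal sketch of a single step of the split, applied to one of the sample strings. The variable names @text, @pattern and @pos are hypothetical. It mirrors what the anchor member of the CTE computes for one row.

```sql
-- Sketch: one step of the date/rest split for a single string.
-- @text, @pattern and @pos are illustrative names only.
DECLARE @text NVARCHAR(MAX) = N'blasdasdasdasda 2013-06-07 2015-01-01 asdasdasd asdasdasd';
DECLARE @pattern VARCHAR(100) = '%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%';
DECLARE @pos INT = PATINDEX(@pattern, @text);

SELECT
    SUBSTRING(@text, @pos, 10)              AS [date],  -- first match: 2013-06-07
    SUBSTRING(@text, @pos + 10, LEN(@text)) AS [rest];  -- text after it, still containing 2015-01-01
```

Each recursive step repeats the same split on [rest] until PATINDEX finds no further match. By default SQL Server stops a recursive CTE after 100 recursion levels, so a row holding more dates than that would probably need an OPTION (MAXRECURSION n) hint on the final SELECT.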