Return Regex Matches from a SQL Query

Time: 2015-06-25 09:57:15

Tags: sql-server regex

I have a table that stores HTML templates containing markup with placeholders in key locations, something like this:

<div>
   <div>{FirstName}</div>
   <div>{LastName}</div>
</div>

I want to write a query that returns all of the placeholders used across all rows of the table.

SELECT Template 
FROM MyTable
WHERE ????

So for the above example the result I want is ...

{FirstName}
{LastName}

I have seen people using regex in SQL, but I can't figure out how to return only the matches rather than the whole column value. Ideally I want one result row per match, but a comma-separated list per matching row (or something similar) would also do.

2 answers:

Answer 0 (score: 1)

See this:

CREATE TABLE #temp(id int identity(1,1), template nvarchar(max))

INSERT INTO #temp(template)
SELECT REPLICATE(N'<div>
   <div>{FirstName}</div>
   <div>{LastName}</div>
</div>',1000)

;WITH cte AS(
    SELECT id, 
        SUBSTRING(template,CHARINDEX(N'{',template),CHARINDEX(N'}',template)-CHARINDEX(N'{',template)+1) as match,
        SUBSTRING(template,CHARINDEX(N'}',template)+1,LEN(template)) as templateRest
    FROM #temp
    UNION ALL
    SELECT id, 
        SUBSTRING(templateRest,CHARINDEX(N'{',templateRest),CHARINDEX(N'}',templateRest)-CHARINDEX(N'{',templateRest)+1) as match,
        SUBSTRING(templateRest,CHARINDEX(N'}',templateRest)+1,LEN(templateRest)) as templateRest
    FROM cte
    WHERE templateRest LIKE N'%}%'
)
SELECT t.id, t.template, c.match
-- Distinct matches only:
-- SELECT DISTINCT t.id, t.template, c.match
FROM cte AS c
INNER JOIN #temp AS t
        ON c.id = t.id
OPTION(MAXRECURSION 1000) -- if needed, this value could still be raised

DROP TABLE #temp
GO

This returns one row per match. You can filter the result by id or template to retrieve the matches for a particular row.
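As a rough sketch of that kind of filtering, the same recursive CTE can be restricted to templates that actually contain a placeholder and reduced to the distinct placeholders across all rows, which is what the question asks for. The sample rows and the LIKE filter on the anchor member are illustrative additions, not part of the answer above, and the sketch assumes every "{" is closed by a later "}" with no stray "}" in front of it:

```sql
CREATE TABLE #temp (id INT IDENTITY(1,1), template NVARCHAR(MAX));

-- Illustrative sample rows (assumed data): two templates with placeholders, one without
INSERT INTO #temp (template)
VALUES (N'<div>{FirstName}</div><div>{LastName}</div>'),
       (N'<p>{FirstName}</p>'),
       (N'<p>no placeholders here</p>');

WITH cte AS (
    -- Anchor: first placeholder of each template that contains "{...}"
    SELECT id,
        SUBSTRING(template, CHARINDEX(N'{', template), CHARINDEX(N'}', template) - CHARINDEX(N'{', template) + 1) AS match,
        SUBSTRING(template, CHARINDEX(N'}', template) + 1, LEN(template)) AS templateRest
    FROM #temp
    WHERE template LIKE N'%{%}%'
    UNION ALL
    -- Recursive step: next placeholder in whatever is left of the template
    SELECT id,
        SUBSTRING(templateRest, CHARINDEX(N'{', templateRest), CHARINDEX(N'}', templateRest) - CHARINDEX(N'{', templateRest) + 1) AS match,
        SUBSTRING(templateRest, CHARINDEX(N'}', templateRest) + 1, LEN(templateRest)) AS templateRest
    FROM cte
    WHERE templateRest LIKE N'%{%}%'
)
SELECT DISTINCT match
FROM cte
ORDER BY match
OPTION (MAXRECURSION 1000);

DROP TABLE #temp;
```

With these sample rows the query should return {FirstName} and {LastName} once each, and the template without placeholders contributes no row.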

Answer 1 (score: 1)

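I would use a numbers table to solve this. It is a very useful thing to have anyway, so if you don't have one I would consider creating one. For the sake of a complete answer, though, I will assume that you don't have one and can't create one. In that case you can easily generate a list of numbers using: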

WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
--N4 (N) AS (SELECT 1 FROM N3 AS N1 CROSS JOIN N3 AS N2)
Numbers (Number) AS (SELECT ROW_NUMBER() OVER(ORDER BY N) FROM N3)

SELECT Number
FROM Numbers;

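This starts with a 10-row table created with a table value constructor (N1), then cross joins that table with itself to get a table of 100 rows (N2), then joins N2 to itself to get 10,000 rows (N3). This can be repeated as often as needed (for example by uncommenting N4) before finally using ROW_NUMBER() to assign a sequential number to each row. Aaron Bertrand has written a very comprehensive series on generating a set or sequence without loops, and this method comes out on top (as a way of creating the table on the fly).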
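If you are able to create a permanent numbers table, the same generator can populate it once. The following is only a sketch: the table name dbo.Numbers and its single Number column are assumptions chosen for illustration, and the script will fail if a table with that name already exists.

```sql
-- Hypothetical permanent numbers table (name and column are assumptions)
CREATE TABLE dbo.Numbers (Number INT NOT NULL PRIMARY KEY);

-- Populate it with 1..10,000 using the same stacked cross joins
WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS a CROSS JOIN N1 AS b),
N3 (N) AS (SELECT 1 FROM N2 AS a CROSS JOIN N2 AS b)
INSERT INTO dbo.Numbers (Number)
SELECT ROW_NUMBER() OVER (ORDER BY N)
FROM N3;

-- Sanity check: should report 10,000 rows ranging from 1 to 10,000
SELECT COUNT(*) AS NumberCount, MIN(Number) AS MinNumber, MAX(Number) AS MaxNumber
FROM dbo.Numbers;
```

The queries that follow could then join to dbo.Numbers directly instead of rebuilding the Numbers CTE each time. Templates longer than 10,000 characters would need more rows.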

Once you have this numbers table, you can join it to your templates, using SUBSTRING to find the position of each "{":

SELECT  t.Template,
        StartPosition = n.Number
FROM    dbo.T AS t
        INNER JOIN Numbers n
            ON SUBSTRING(t.Template, n.Number, 1) = '{';

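With your example this returns 16 and 43 (assuming the template is stored with CRLF line endings; with LF-only line endings the positions would be 15 and 41). You can then use CHARINDEX to find the "}" that follows each "{":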

SELECT  t.Template,
        StartPosition = n.Number,
        EndPosition = CHARINDEX('}', t.Template, n.Number) + 1
FROM    dbo.T AS t
        INNER JOIN Numbers n
            ON SUBSTRING(t.Template, n.Number, 1) = '{';

You can then use SUBSTRING again to extract the term between each start and end position. So a full working example would be:

DECLARE @T TABLE (Template NVARCHAR(MAX));
INSERT @T (Template)
VALUES ('<div>
   <div>{FirstName}</div>
   <div>{LastName}</div>
</div>');

WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
--N4 (N) AS (SELECT 1 FROM N3 AS N1 CROSS JOIN N3 AS N2)
Numbers (Number) AS (SELECT ROW_NUMBER() OVER(ORDER BY N) FROM N3)

SELECT  t.Template,
        StartPosition = n.Number,
        EndPosition = CHARINDEX('}', t.Template, n.Number) + 1,
        Term = SUBSTRING(t.Template, n.Number, CHARINDEX('}', t.Template, n.Number) + 1 - n.Number)
FROM    @T t
        INNER JOIN Numbers n
            ON SUBSTRING(t.Template, n.Number, 1) = '{';
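
The question also mentions that a comma-separated list per row would be acceptable. The following is a minimal sketch of that, building on the query above. The Id column, the sample rows, the Terms CTE name, and the NULLIF guard against a "{" that is never closed are all assumptions added for illustration. It uses STUFF with FOR XML PATH, which works on versions of SQL Server that predate STRING_AGG (SQL Server 2017 and later).

```sql
DECLARE @T TABLE (Id INT, Template NVARCHAR(MAX));
INSERT @T (Id, Template)
VALUES (1, N'<div><div>{FirstName}</div><div>{LastName}</div></div>'),
       (2, N'<p>{Email}</p>');

WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
Numbers (Number) AS (SELECT ROW_NUMBER() OVER(ORDER BY N) FROM N3),
Terms AS (
    -- One row per placeholder; a "{" with no later "}" is skipped
    SELECT  t.Id,
            StartPosition = n.Number,
            Term = SUBSTRING(t.Template, n.Number, e.ClosePosition - n.Number + 1)
    FROM    @T t
            INNER JOIN Numbers n
                ON SUBSTRING(t.Template, n.Number, 1) = '{'
            CROSS APPLY (SELECT NULLIF(CHARINDEX('}', t.Template, n.Number), 0)) e (ClosePosition)
    WHERE   e.ClosePosition IS NOT NULL
)
-- Concatenate each row's placeholders in order of appearance
SELECT  t.Id,
        Placeholders = STUFF((SELECT ',' + x.Term
                              FROM   Terms x
                              WHERE  x.Id = t.Id
                              ORDER BY x.StartPosition
                              FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM    @T t
ORDER BY t.Id;
```

For the two sample rows this should give {FirstName},{LastName} for Id 1 and {Email} for Id 2. A template with no placeholders would get NULL in the Placeholders column.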