I have an HTML field that contains the HTML of an entire web page extracted from a Word document.
The HTML may contain something like the following:
<p>Please refer to <|Any combination of words|> policy.</p>
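I need to grab whatever sits between the two | characters. The catch is that there are multiple |'s throughout the document, so it should only match the |'s that appear between "Please refer to" and "policy".
I then need to replace the placeholder with an HTML link: <a href="Any combination of words">Any combination of words</a>
So, if I ran the code against the following: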
<p>Please refer to <|Specific Policy Name|> policy.</p>
it would replace <|Specific Policy Name|> with:
<a href="Specific Policy Name">Specific Policy Name</a>
Can this be done in SQL?

Answer 0 (score: 1)
Try this solution:
SET NOCOUNT ON;

DECLARE @MyTable TABLE
(
    ID INT IDENTITY(1,1) PRIMARY KEY,
    OldContent NVARCHAR(MAX) NOT NULL,
    NewContent NVARCHAR(MAX) NULL
);

INSERT INTO @MyTable (OldContent)
VALUES (N'<p>Please refer to <|Specific Policy Name|> policy.</p>');

WITH UpdateCTE
AS
(
    -- Replace '<|' + name + '|>' (name length + 4 characters) with the link.
    SELECT b.NewContent,
           STUFF(b.InnerText, b.StartIndex - 2, b.EndIndex - b.StartIndex + 4,
                 '<a href="' + b.[Text] + '">' + b.[Text] + '</a>') AS ChangedText
    FROM
    (
        -- The policy name is the text from StartIndex up to (not including) EndIndex.
        SELECT a.*,
               SUBSTRING(a.InnerText, a.StartIndex, a.EndIndex - a.StartIndex) AS [Text]
        FROM
        (
            -- 'Please refer to <|' is 18 characters long, so +18 points at the
            -- first character of the name; EndIndex points at the closing '|'.
            SELECT PATINDEX('%Please refer to <|%', t.OldContent) + 18 AS StartIndex,
                   PATINDEX('%|> policy.%', t.OldContent) AS EndIndex,
                   t.OldContent AS InnerText,
                   t.NewContent
            FROM @MyTable t
        ) a
    ) b
)
UPDATE UpdateCTE
SET NewContent = ChangedText;

SELECT *
FROM @MyTable x;

Result:

ID  OldContent                                               NewContent
--  -------------------------------------------------------  --------------------------------------------------------------------------------------
1   <p>Please refer to <|Specific Policy Name|> policy.</p>  <p>Please refer to <a href="Specific Policy Name">Specific Policy Name</a> policy.</p>
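The query in this answer rewrites one placeholder per row, while the question mentions that the document contains many |'s. If a single HTML value can hold several "Please refer to <|...|> policy." placeholders, one possible approach is to loop until no opening marker is left. The following is only a sketch under that assumption: the variable names (@Html, @Open, @Close, @Start, @End, @Name, @Link) and the sample HTML are made up for illustration, and it uses CHARINDEX instead of PATINDEX so the markers are matched literally.

```sql
-- Sketch only: all variable names and the sample HTML are illustrative.
DECLARE @Html NVARCHAR(MAX) =
    N'<p>Please refer to <|Policy A|> policy.</p>' +
    N'<p>Ignore a|b here.</p>' +
    N'<p>Please refer to <|Policy B|> policy.</p>';
DECLARE @Open  NVARCHAR(50) = N'Please refer to <|';
DECLARE @Close NVARCHAR(50) = N'|> policy.';
DECLARE @Start INT, @End INT;
DECLARE @Name NVARCHAR(4000), @Link NVARCHAR(MAX);

-- CHARINDEX performs a literal search, so no wildcard characters are involved.
SET @Start = CHARINDEX(@Open, @Html);

WHILE @Start > 0
BEGIN
    -- Move to the first character of the policy name.
    SET @Start = @Start + LEN(@Open);

    -- Look for the closing marker only after the opening marker.
    SET @End = CHARINDEX(@Close, @Html, @Start);
    IF @End = 0 BREAK;  -- unmatched opener: leave the remaining text untouched

    SET @Name = SUBSTRING(@Html, @Start, @End - @Start);
    SET @Link = N'<a href="' + @Name + N'">' + @Name + N'</a>';

    -- Replace '<|' + name + '|>' (name length + 4 characters) with the link.
    SET @Html = STUFF(@Html, @Start - 2, @End - @Start + 4, @Link);

    -- Continue searching after the inserted link so the loop always advances.
    SET @Start = CHARINDEX(@Open, @Html, @Start - 2 + LEN(@Link));
END;

SELECT @Html AS NewContent;
```

With the sample value above, both placeholders should become links while the stray | in the middle paragraph is left alone. To apply the same idea to a table, the loop body could be wrapped in a scalar function or run per row; which of those fits best depends on the data volume.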