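I am trying to convert HTML names such as &amp;amp; and &amp;quot; to their equivalent CHAR values using the SQL below. I tested this on SQL Server 2012.
Test 1 (this works fine):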
GO
DECLARE @inputString VARCHAR(MAX)= '&amp;testString&amp;'
DECLARE @codePos INT, @codeEncoded VARCHAR(7), @startIndex INT, @resultString varchar(max)
SET @resultString = LTRIM(RTRIM(@inputString))
SELECT @startIndex = PATINDEX('%&amp;%', @resultString)
WHILE @startIndex > 0
BEGIN
SELECT @resultString = REPLACE(@resultString, '&amp;', '&'), @startIndex=PATINDEX('%&amp;%', @resultString)
END
PRINT @resultString
Go
Output:
&testString&
Test 2 (this does not work): Since the above worked, I tried to extend it to handle more characters, as follows:
DECLARE @htmlNames TABLE (ID INT IDENTITY(1,1), asciiDecimal INT, htmlName varchar(50))
INSERT INTO @htmlNames
VALUES (34,'&quot;'),(38,'&amp;'),(60,'&lt;'),(62,'&gt;'),(160,'&nbsp;'),(161,'&iexcl;'),(162,'&cent;')
-- I would load the full list of HTML names into this TABLE variable, but removed them for testing purposes
DECLARE @inputString VARCHAR(MAX)= '&amp;testString&amp;'
DECLARE @count INT = 0
DECLARE @id INT = 1
DECLARE @charCode INT, @htmlName VARCHAR(30)
DECLARE @codePos INT, @codeEncoded VARCHAR(7), @startIndex INT
, @resultString varchar(max)
SELECT @count=COUNT(*) FROM @htmlNames
WHILE @id <=@count
BEGIN
SELECT @charCode = asciiDecimal, @htmlname = htmlName
FROM @htmlNames
WHERE ID = @id
SET @resultString = LTRIM(RTRIM(@inputString))
SELECT @startIndex = PATINDEX('%' + @htmlName + '%', @resultString)
While @startIndex > 0
BEGIN
--PRINT @resultString + '|' + @htmlName + '|' + NCHAR(@charCode)
SELECT @resultString = REPLACE(@resultString, @htmlName, NCHAR(@charCode))
SET @startIndex=PATINDEX('%' + @htmlName + '%', @resultString)
END
SET @id=@id + 1
END
PRINT @resultString
GO
Output:
&amp;testString&amp;
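I can't figure out where I'm going wrong. Any help would be much appreciated.
I don't want to load the string values into the application layer, apply HTMLDecode there, and save them back to the database.
Edit:
The line SET @resultString = LTRIM(RTRIM(@inputString)) was inside the WHILE loop, so I was overwriting the result with @inputString on every iteration. Thank you, YanireRomero.
I also like @RichardDeeming's solution, but it didn't suit my needs in this case.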
Answer 0 (score: 16)
Here is a simple solution that doesn't need a loop:
DECLARE @htmlNames TABLE
(
ID INT IDENTITY(1,1),
asciiDecimal INT,
htmlName varchar(50)
);
INSERT INTO @htmlNames
VALUES
(34,'&quot;'),
(38,'&amp;'),
(60,'&lt;'),
(62,'&gt;'),
(160,'&nbsp;'),
(161,'&iexcl;'),
(162,'&cent;')
;
DECLARE @inputString varchar(max)= '&amp;test&amp;quot;&lt;String&gt;&quot;&amp;';
DECLARE @resultString varchar(max) = @inputString;
-- Simple HTML-decode:
SELECT
@resultString = Replace(@resultString COLLATE Latin1_General_CS_AS, htmlName, NCHAR(asciiDecimal))
FROM
@htmlNames
;
SELECT @resultString;
-- Output: &test&quot;<String>"&
-- Multiple HTML-decode:
SET @resultString = @inputString;
DECLARE @temp varchar(max) = '';
WHILE @resultString != @temp
BEGIN
SET @temp = @resultString;
SELECT
@resultString = Replace(@resultString COLLATE Latin1_General_CS_AS, htmlName, NCHAR(asciiDecimal))
FROM
@htmlNames
;
END;
SELECT @resultString;
-- Output: &test"<String>"&
Edit: changed to NCHAR as suggested by @tomasofen, and added a case-sensitive collation to the REPLACE function as suggested by @TechyGypo.
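To illustrate why the case-sensitive collation matters, here is a minimal sketch. The sample string and the column aliases are made up for the demonstration; the point is that entity names such as &Aacute; and &aacute; differ only by case, so a case-insensitive REPLACE would treat them as the same name.

```sql
-- Hypothetical sample value: an upper-case and a lower-case entity name.
DECLARE @sample varchar(100) = '&Aacute; and &aacute;';

-- Case-insensitive collation: the lower-case pattern also matches '&Aacute;',
-- so both entities are replaced with the same lower-case character.
SELECT REPLACE(@sample COLLATE Latin1_General_CI_AS, '&aacute;', NCHAR(225)) AS case_insensitive;

-- Case-sensitive collation: only the exact-case entity is replaced,
-- and '&Aacute;' is left for its own row in the lookup table.
SELECT REPLACE(@sample COLLATE Latin1_General_CS_AS, '&aacute;', NCHAR(225)) AS case_sensitive;
```

With the full entity list loaded, the case-sensitive form is what keeps upper-case and lower-case accented letters from being mixed up.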
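Answer 1 (score: 5)
For performance, you should not write this as T-SQL statements or as a SQL scalar-valued function. The .NET libraries provide excellent, fast and, above all, reliable HTML decoding. In my opinion, you should implement this as a SQL CLR function, like this: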
using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;
using System.Net;
public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction(
IsDeterministic = true,
IsPrecise = true,
DataAccess = DataAccessKind.None,
SystemDataAccess = SystemDataAccessKind.None)]
[return: SqlFacet(MaxSize = 4000)]
public static SqlString cfnHtmlDecode([SqlFacet(MaxSize = 4000)] SqlString input)
{
if (input.IsNull)
return null;
return System.Net.WebUtility.HtmlDecode(input.Value);
}
}
Then call it from your T-SQL like this:
SELECT clr_schema.cfnHtmlDecode(column_name) FROM table_schema.table_name
Answer 2 (score: 2)
Hey, it was an assignment error:
DECLARE @htmlNames TABLE (ID INT IDENTITY(1,1), asciiDecimal INT, htmlName varchar(50))
INSERT INTO @htmlNames
VALUES (34,'&quot;'),(38,'&amp;'),(60,'&lt;'),(62,'&gt;'),(160,'&nbsp;'),(161,'&iexcl;'),(162,'&cent;')
-- I would load the full list of HTML names into this TABLE variable, but removed them for testing purposes
DECLARE @inputString VARCHAR(MAX)= '&amp;testString&amp;'
DECLARE @count INT = 0
DECLARE @id INT = 1
DECLARE @charCode INT, @htmlName VARCHAR(30)
DECLARE @codePos INT, @codeEncoded VARCHAR(7), @startIndex INT
, @resultString varchar(max)
SELECT @count=COUNT(*) FROM @htmlNames
SET @resultString = LTRIM(RTRIM(@inputString))
WHILE @id <=@count
BEGIN
SELECT @charCode = asciiDecimal, @htmlname = htmlName
FROM @htmlNames
WHERE ID = @id
SELECT @startIndex = PATINDEX('%' + @htmlName + '%', @resultString)
While @startIndex > 0
BEGIN
--PRINT @resultString + '|' + @htmlName + '|' + NCHAR(@charCode)
SET @resultString = REPLACE(@resultString, @htmlName, NCHAR(@charCode))
SET @startIndex=PATINDEX('%' + @htmlName + '%', @resultString)
END
SET @id=@id + 1
END
PRINT @resultString
GO
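The line SET @resultString = LTRIM(RTRIM(@inputString)) was inside the loop, so you were overwriting your result on every iteration.
Hope it helps.
Answer 3 (score: 2)
Some extra help building on Richard Deeming's answer, to save future visitors the typing when extending the function with more codes: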
INSERT INTO @htmlNames
VALUES
(34,'&quot;'),
(38,'&amp;'),
(60,'&lt;'),
(62,'&gt;'),
(160, '&nbsp;'),
(161, '&iexcl;'),
(162, '&cent;'),
(163, '&pound;'),
(164, '&curren;'),
(165, '&yen;'),
(166, '&brvbar;'),
(167, '&sect;'),
(168, '&uml;'),
(169, '&copy;'),
(170, '&ordf;'),
(171, '&laquo;'),
(172, '&not;'),
(173, '&shy;'),
(174, '&reg;'),
(175, '&macr;'),
(176, '&deg;'),
(177, '&plusmn;'),
(178, '&sup2;'),
(179, '&sup3;'),
(180, '&acute;'),
(181, '&micro;'),
(182, '&para;'),
(183, '&middot;'),
(184, '&cedil;'),
(185, '&sup1;'),
(186, '&ordm;'),
(187, '&raquo;'),
(188, '&frac14;'),
(189, '&frac12;'),
(190, '&frac34;'),
(191, '&iquest;'),
(192, '&Agrave;'),
(193, '&Aacute;'),
(194, '&Acirc;'),
(195, '&Atilde;'),
(196, '&Auml;'),
(197, '&Aring;'),
(198, '&AElig;'),
(199, '&Ccedil;'),
(200, '&Egrave;'),
(201, '&Eacute;'),
(202, '&Ecirc;'),
(203, '&Euml;'),
(204, '&Igrave;'),
(205, '&Iacute;'),
(206, '&Icirc;'),
(207, '&Iuml;'),
(208, '&ETH;'),
(209, '&Ntilde;'),
(210, '&Ograve;'),
(211, '&Oacute;'),
(212, '&Ocirc;'),
(213, '&Otilde;'),
(214, '&Ouml;'),
(215, '&times;'),
(216, '&Oslash;'),
(217, '&Ugrave;'),
(218, '&Uacute;'),
(219, '&Ucirc;'),
(220, '&Uuml;'),
(221, '&Yacute;'),
(222, '&THORN;'),
(223, '&szlig;'),
(224, '&agrave;'),
(225, '&aacute;'),
(226, '&acirc;'),
(227, '&atilde;'),
(228, '&auml;'),
(229, '&aring;'),
(230, '&aelig;'),
(231, '&ccedil;'),
(232, '&egrave;'),
(233, '&eacute;'),
(234, '&ecirc;'),
(235, '&euml;'),
(236, '&igrave;'),
(237, '&iacute;'),
(238, '&icirc;'),
(239, '&iuml;'),
(240, '&eth;'),
(241, '&ntilde;'),
(242, '&ograve;'),
(243, '&oacute;'),
(244, '&ocirc;'),
(245, '&otilde;'),
(246, '&ouml;'),
(247, '&divide;'),
(248, '&oslash;'),
(249, '&ugrave;'),
(250, '&uacute;'),
(251, '&ucirc;'),
(252, '&uuml;'),
(253, '&yacute;'),
(254, '&thorn;'),
(255, '&yuml;'),
(8364, '&euro;');
Edit:
If you want the euro symbol to work (and, in general, any character code above 255), you need to use NCHAR instead of CHAR in Richard Deeming's code.
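A small sketch of the difference, using a made-up sample string and column aliases: CHAR only accepts codes from 0 through 255 and returns NULL outside that range, and REPLACE returns NULL when any of its arguments is NULL, so a single out-of-range code would wipe out the whole result.

```sql
-- CHAR is limited to 0 through 255; NCHAR accepts Unicode code points.
SELECT CHAR(8364)  AS char_result,   -- NULL, because 8364 is out of range for CHAR
       NCHAR(8364) AS nchar_result;  -- the euro sign

-- Hypothetical sample text containing the euro entity.
DECLARE @price varchar(100) = 'Price: &euro;10';

-- With CHAR the replacement argument is NULL, so REPLACE returns NULL.
SELECT REPLACE(@price, '&euro;', CHAR(8364))  AS with_char;

-- With NCHAR the entity is replaced by the euro sign.
SELECT REPLACE(@price, '&euro;', NCHAR(8364)) AS with_nchar;
```

Note that if the result is stored back into a varchar variable or column, whether a given character survives depends on the code page of the collation in use; declaring the result as nvarchar(max) avoids that question.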