URL decoding using T-SQL (for extended ASCII)

Time: 2013-08-21 20:57:49

Tags: sql-server tsql user-defined-functions url-encoding string-decoding

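I need some help fine-tuning a T-SQL function so that it correctly decodes strings containing URLs. Only the query-string parameters are URL-encoded, not the whole URL. The original function works well for single-byte characters, but it does not handle multi-byte characters. To decode multi-byte characters such as Spanish accented letters, my plan is to find the encoded values with PATINDEX and replace them using a lookup table (this is feasible because only a small number of special characters fall into this category).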

Problem: The pattern specified below returns no match, so I am pretty much stuck at this point.

Example: The pattern '%[%][0-9a-f][0-9a-f]%' works for single-byte encoded characters. Similarly, the pattern '%[%][0-9a-f][0-9a-f][%][0-9a-f][0-9a-f]%' should work for double-byte characters such as %C3%A9 (-> é), but it does not.
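As a quick check of what these two patterns mean, the sketch below runs both against the sample string used in the question. It assumes a case-insensitive comparison (forced here with COLLATE, because under a binary collation [0-9a-f] would not match uppercase hex digits) and passes the patterns as literals, so the width of a pattern variable plays no part.

```sql
-- Sample input taken from the question.
DECLARE @URL VARCHAR(8000) = '%26Text1%3DFrom%20Ren%C3%A9%27s';

SELECT
    -- [%] is a one-character class holding a literal percent sign;
    -- the leading and trailing % are ordinary wildcards.
    PATINDEX('%[%][0-9a-f][0-9a-f]%',
             @URL COLLATE Latin1_General_CI_AS) AS FirstEncodedByte,
    -- Two escaped bytes back to back, as in %C3%A9.
    PATINDEX('%[%][0-9a-f][0-9a-f][%][0-9a-f][0-9a-f]%',
             @URL COLLATE Latin1_General_CI_AS) AS FirstEncodedPair;
```

With this input the first column should come back as 1 (the leading %26) and the second as 22 (the start of %C3%A9). If the second call finds a match on your server, the pattern text itself is fine and the cause lies elsewhere, for example in how the pattern is stored.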

Here is my code:

DECLARE @Position INT,
    @Base CHAR(16),
    @High TINYINT,
    @Low TINYINT,
    @Pattern VARCHAR(256),
    @URL VARCHAR(8000)

SET @URL = '%26Text1%3DFrom%20Ren%C3%A9%27s'

-- Look for two consecutive percent-encoded bytes, e.g. %C3%A9
SELECT  @Base = '0123456789abcdef',
    @Pattern = '%[%][0-9a-f][0-9a-f][%][0-9a-f][0-9a-f]%',
    --@URL = REPLACE(@URL, '+', ' '),
    @Position = PATINDEX(@Pattern, @URL)

PRINT 'Position: ' + CAST(@Position AS VARCHAR(256))

WHILE @Position > 0
BEGIN
    SELECT
        -- 1-based index of each hex digit of the first byte within @Base
        @High = CHARINDEX(SUBSTRING(@URL, @Position + 1, 1), @Base COLLATE Latin1_General_CI_AS),
        @Low = CHARINDEX(SUBSTRING(@URL, @Position + 2, 1), @Base COLLATE Latin1_General_CI_AS),
        -- Debug placeholder: overwrite the 6-character match so it is not found again
        @URL = STUFF(@URL, @Position, 6, '123456'),
        @Position = PATINDEX(@Pattern, @URL)

    PRINT 'High: ' + CAST(@High AS VARCHAR(256))

    PRINT @URL
END
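For context, the @High and @Low values computed in the loop are the building blocks of the usual single-byte decode. The original function is not shown in the question, so the following is only a sketch of how such a function typically turns %XX into one character. It assumes a case-insensitive comparison and uses a shortened, hypothetical input that contains single-byte escapes only.

```sql
DECLARE @Base     CHAR(16)      = '0123456789abcdef',
        @URL      VARCHAR(8000) = '%26Text1%3DFrom%20Ren',  -- hypothetical input
        @Position INT,
        @High     TINYINT,
        @Low      TINYINT;

SET @Position = PATINDEX('%[%][0-9a-f][0-9a-f]%', @URL COLLATE Latin1_General_CI_AS);

WHILE @Position > 0
BEGIN
    -- CHARINDEX is 1-based, so subtract 1 to get the nibble value 0..15.
    SET @High = CHARINDEX(SUBSTRING(@URL, @Position + 1, 1), @Base COLLATE Latin1_General_CI_AS) - 1;
    SET @Low  = CHARINDEX(SUBSTRING(@URL, @Position + 2, 1), @Base COLLATE Latin1_General_CI_AS) - 1;

    -- Replace the three characters %XX with the single character they encode.
    SET @URL = STUFF(@URL, @Position, 3, CHAR(16 * @High + @Low));

    SET @Position = PATINDEX('%[%][0-9a-f][0-9a-f]%', @URL COLLATE Latin1_General_CI_AS);
END;

SELECT @URL AS Decoded;
```

With this input the result should be &Text1=From Ren. Applied to %C3%A9, the same loop would decode the two bytes separately (typically shown as Ã© under a Windows-1252 code page) rather than as é, which is the multi-byte problem described above. A decoded % followed by two hex digits would also be decoded a second time, so this is not a complete decoder.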

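2 answers:

Answer 0: (score: -1)

Declaring @Pattern as CHAR(21) truncates it. The pattern is 36 characters long, so a CHAR(21) variable keeps only '%[%][0-9a-f][0-9a-f][%]', which has no trailing wildcard and therefore matches nothing in this URL. Declare the variable wide enough, for example: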

SET NOCOUNT ON

DECLARE @Position INT,
    @Base CHAR(16),
    @High TINYINT,
    @Low TINYINT,
    @Pattern VARCHAR(200),   -- wide enough to hold the whole 36-character pattern
    @URL VARCHAR(8000)

SET @URL = '%26Text1%3DFrom%20Ren%C3%A9%27s'

SELECT  @Base = '0123456789abcdef',
    @Pattern = '%[%][0-9a-f][0-9a-f][%][0-9a-f][0-9a-f]%',
    --@URL = REPLACE(@URL, '+', ' '),
    @Position = PATINDEX(@Pattern, @URL)

-- Inspect the values to confirm the pattern is intact and a match is found
SELECT @URL
SELECT @Pattern
SELECT @Position
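The effect of the variable width can be seen side by side in the sketch below. The variable names @Narrow and @Wide are made up for this illustration, and COLLATE is added as an assumption so that the result does not depend on the database's default collation.

```sql
DECLARE @URL    VARCHAR(8000) = '%26Text1%3DFrom%20Ren%C3%A9%27s',
        -- Assigning a longer string to a CHAR(21) variable silently truncates it.
        @Narrow CHAR(21)      = '%[%][0-9a-f][0-9a-f][%][0-9a-f][0-9a-f]%',
        @Wide   VARCHAR(200)  = '%[%][0-9a-f][0-9a-f][%][0-9a-f][0-9a-f]%';

SELECT @Narrow   AS NarrowPattern,  -- only the first 21 characters survive
       LEN(@Wide) AS FullLength,    -- length of the untruncated pattern
       PATINDEX(@Narrow, @URL COLLATE Latin1_General_CI_AS) AS NarrowMatch,
       PATINDEX(@Wide,   @URL COLLATE Latin1_General_CI_AS) AS WideMatch;
```

NarrowMatch should be 0, because the truncated pattern ends in [%] with no trailing wildcard and so only matches a string that ends in %XX%. WideMatch should be 22, the position of %C3%A9.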

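Answer 1: (score: -1)

I had a syntax error in the pattern. After reading the documentation carefully, I realized I needed an additional percent sign to escape the % symbol. Here is the working solution (the subquery that replaces the values does not work, but the pattern does):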

DECLARE @Position INT,
    @Base CHAR(16),
    @High TINYINT,
    @Low TINYINT,
    @Pattern NVARCHAR(256),
    @ToReplace NVARCHAR(256),
    @ReplaceWith NVARCHAR(256),
    @URL NVARCHAR(4000)

SET @URL = '%26Text1%3DFrom%20Ren%C3%A9%27s%C3'

SELECT  @Base = '0123456789abcdef',
    @Pattern = '%[%%][c-f][0-9]%%[0-9a-f]%',
    --@URL = REPLACE(@URL, '+', ' '),
    @Position = PATINDEX(@Pattern, @URL)

PRINT 'Position: ' + CAST(@Position AS VARCHAR(256))

WHILE @Position > 0
BEGIN
    SELECT
        @High = CHARINDEX(SUBSTRING(@URL, @Position + 1, 1), @Base COLLATE Latin1_General_CI_AS),
        @Low = CHARINDEX(SUBSTRING(@URL, @Position + 2, 1), @Base COLLATE Latin1_General_CI_AS),
        -- The 6-character encoded sequence, e.g. %C3%A9
        @ToReplace = SUBSTRING(@URL, @Position, 6),
        -- COALESCE wraps the subquery so the fallback also applies when no lookup row matches
        @ReplaceWith = COALESCE((SELECT [Text] FROM dbo.ExtendedAsciiLookup WHERE UTF = @ToReplace), N'Something'),
        @URL = STUFF(@URL, @Position, 6, @ReplaceWith),
        @Position = PATINDEX(@Pattern, @URL)

    PRINT 'High: ' + CAST(@High AS VARCHAR(256))
    PRINT '@ToReplace: ' + CAST(COALESCE(@ToReplace, '') AS NVARCHAR(256))
    PRINT 'With: ' + CAST(COALESCE(@ReplaceWith, '') AS NVARCHAR(256))

    PRINT @URL
END
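One caveat on the pattern above, offered as a sketch rather than a correction of the author's results: in T-SQL LIKE/PATINDEX syntax a percent sign inside brackets is already literal, so [%%] behaves like [%], and %% outside brackets is simply two wildcards. The pattern therefore matches %C3 followed anywhere later by a hex digit. The input @Sample and the stricter two-byte pattern are assumptions made for this illustration. The strict pattern relies on UTF-8 two-byte sequences having a lead byte in C2..DF and a continuation byte in 80..BF.

```sql
-- Hypothetical input: one encoded byte, then unrelated text ending in a digit.
DECLARE @Sample NVARCHAR(100) = N'%C3xyz9';

SELECT
    -- Pattern from the answer: matches although no second encoded byte follows.
    PATINDEX(N'%[%%][c-f][0-9]%%[0-9a-f]%',
             @Sample COLLATE Latin1_General_CI_AS) AS AnswerPattern,
    -- Same pattern without the doubled percent signs: same result.
    PATINDEX(N'%[%][c-f][0-9]%[0-9a-f]%',
             @Sample COLLATE Latin1_General_CI_AS) AS WithoutDoubling,
    -- Stricter sketch: an encoded lead byte immediately followed by an
    -- encoded continuation byte (first hex digit 8, 9, A or B).
    PATINDEX(N'%[%][c-d][0-9a-f][%][89ab][0-9a-f]%',
             @Sample COLLATE Latin1_General_CI_AS) AS StrictTwoByte;
```

The first two columns should both return 1 and the third 0. Run against the URL from the answer, the strict pattern should still find %C3%A9 at position 22.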