T-SQL: How to compare two variables of type XML when length > VARCHAR(MAX)?

Time: 2012-01-26 03:43:13

Tags: sql-server xml tsql comparison

Using only SQL Server 2008 R2 (this will be in a stored procedure), how can I determine whether two variables of type XML are equal?

Here is what I want to do:

DECLARE @XmlA   XML
DECLARE @XmlB   XML

SET @XmlA = '[Really long Xml value]'
SET @XmlB = '[Really long Xml value]'

IF @XmlA = @XmlB
    SELECT 'Matching Xml!'

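But, as you probably know, it returns:

Msg 305, Level 16, State 1, Line 7 The XML data type cannot be compared or sorted, except when using the IS NULL operator.

I can convert to VarChar(MAX) and compare, but that only compares the first 2MB. Is there another way?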
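For reference, the string-comparison approach mentioned above might look like the following minimal sketch. The sample XML values are made up for illustration. The cast compares the serialized text rather than the XML structure, so documents that are equivalent but serialized differently would not match. Note also that the (MAX) types are documented to hold up to 2 GB, so the size limit may be less tight than 2MB.

```sql
-- Sketch only: compares the serialized text, not the XML structure.
DECLARE @XmlA XML = N'<root><item id="1">a</item></root>';
DECLARE @XmlB XML = N'<root><item id="1">a</item></root>';

-- A binary collation forces a case-sensitive comparison; under a
-- case-insensitive default collation, values differing only in case
-- would compare as equal.
IF CONVERT(NVARCHAR(MAX), @XmlA) COLLATE Latin1_General_BIN2
   = CONVERT(NVARCHAR(MAX), @XmlB) COLLATE Latin1_General_BIN2
    SELECT 'Matching Xml!';
ELSE
    SELECT 'Different Xml.';
```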

5 answers:

Answer 0: (score: 2)

Check out this SQL function:

CREATE FUNCTION [dbo].[CompareXml]
(
    @xml1 XML,
    @xml2 XML
)
RETURNS INT
AS 
BEGIN
    DECLARE @ret INT
    SELECT @ret = 0


    -- -------------------------------------------------------------
    -- If one of the arguments is NULL then we assume that they are
    -- not equal. 
    -- -------------------------------------------------------------
    IF @xml1 IS NULL OR @xml2 IS NULL 
    BEGIN
        RETURN 1
    END

    -- -------------------------------------------------------------
    -- Match the name of the elements 
    -- -------------------------------------------------------------
    IF  (SELECT @xml1.value('(local-name((/*)[1]))','VARCHAR(MAX)')) 
        <> 
        (SELECT @xml2.value('(local-name((/*)[1]))','VARCHAR(MAX)'))
    BEGIN
        RETURN 1
    END

    -- -------------------------------------------------------------
    -- Match the value of the elements
    -- -------------------------------------------------------------
    IF((@xml1.query('count(/*)').value('.','INT') = 1) AND (@xml2.query('count(/*)').value('.','INT') = 1))
    BEGIN
        DECLARE @elValue1 VARCHAR(MAX), @elValue2 VARCHAR(MAX)

        SELECT
            @elValue1 = @xml1.value('((/*)[1])','VARCHAR(MAX)'),
            @elValue2 = @xml2.value('((/*)[1])','VARCHAR(MAX)')

        IF  @elValue1 <> @elValue2
        BEGIN
            RETURN 1
        END
    END

    -- -------------------------------------------------------------
    -- Match the number of attributes 
    -- -------------------------------------------------------------
    DECLARE @attCnt1 INT, @attCnt2 INT
    SELECT
        @attCnt1 = @xml1.query('count(/*/@*)').value('.','INT'),
        @attCnt2 = @xml2.query('count(/*/@*)').value('.','INT')

    IF  @attCnt1 <> @attCnt2 BEGIN
        RETURN 1
    END


    -- -------------------------------------------------------------
    -- Match the names and values of the attributes 
    -- Here we need to run a loop over each attribute in the 
    -- first XML element and see if the same attribute exists
    -- in the second element. If the attribute exists, we
    -- need to check if the value is the same.
    -- -------------------------------------------------------------
    DECLARE @cnt INT, @cnt2 INT
    DECLARE @attName VARCHAR(MAX)
    DECLARE @attValue VARCHAR(MAX)

    SELECT @cnt = 1

    WHILE @cnt <= @attCnt1 
    BEGIN
        SELECT @attName = NULL, @attValue = NULL
        SELECT
            @attName = @xml1.value(
                'local-name((/*/@*[sql:variable("@cnt")])[1])', 
                'varchar(MAX)'),
            @attValue = @xml1.value(
                '(/*/@*[sql:variable("@cnt")])[1]', 
                'varchar(MAX)')

        -- check if the attribute exists in the other XML document
        IF @xml2.exist(
                '(/*/@*[local-name()=sql:variable("@attName")])[1]'
            ) = 0
        BEGIN
            RETURN 1
        END

        IF  @xml2.value(
                '(/*/@*[local-name()=sql:variable("@attName")])[1]', 
                'varchar(MAX)')
            <>
            @attValue
        BEGIN
            RETURN 1
        END

        SELECT @cnt = @cnt + 1
    END

    -- -------------------------------------------------------------
    -- Match the number of child elements 
    -- -------------------------------------------------------------
    DECLARE @elCnt1 INT, @elCnt2 INT
    SELECT
        @elCnt1 = @xml1.query('count(/*/*)').value('.','INT'),
        @elCnt2 = @xml2.query('count(/*/*)').value('.','INT')


    IF  @elCnt1 <> @elCnt2
    BEGIN
        RETURN 1
    END


    -- -------------------------------------------------------------
    -- Start recursion for each child element
    -- -------------------------------------------------------------
    SELECT @cnt = 1
    SELECT @cnt2 = 1
    DECLARE @x1 XML, @x2 XML
    DECLARE @noMatch INT

    WHILE @cnt <= @elCnt1 
    BEGIN

        SELECT @x1 = @xml1.query('/*/*[sql:variable("@cnt")]')
        --RETURN CONVERT(VARCHAR(MAX),@x1)
        WHILE @cnt2 <= @elCnt2
        BEGIN
            SELECT @x2 = @xml2.query('/*/*[sql:variable("@cnt2")]')
            SELECT @noMatch = dbo.CompareXml( @x1, @x2 )
            IF @noMatch = 0 BREAK
            SELECT @cnt2 = @cnt2 + 1
        END

        SELECT @cnt2 = 1

        IF @noMatch = 1
        BEGIN
            RETURN 1
        END

        SELECT @cnt = @cnt + 1
    END

    RETURN @ret
END

Here is the source.


The function cannot compare XML fragments, that is, XML without a single root element, for example:

SELECT dbo.CompareXml('<data/>', '<data/><data234/>') 

To work around this, you must wrap the XML in a root element when passing it to the function, or edit the function to do so. For example:

SELECT dbo.CompareXml('<r><data/></r>', '<r><data/><data234/></r>')  
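The wrapping can also be done with the query() method instead of editing the input strings by hand. The following is a minimal sketch under some assumptions: it relies on the dbo.CompareXml function above already existing, the root name r is arbitrary, and the path /* copies only top-level elements, so any top-level text nodes in a fragment would be dropped.

```sql
DECLARE @a XML = N'<data/>';
DECLARE @b XML = N'<data/><data234/>';

-- Wrap every top-level element of each fragment in a single <r> root.
-- <r> is an arbitrary name chosen for this sketch.
DECLARE @wrappedA XML = @a.query('<r>{ /* }</r>');
DECLARE @wrappedB XML = @b.query('<r>{ /* }</r>');

-- CompareXml returns 0 when the documents match and 1 when they differ.
IF dbo.CompareXml(@wrappedA, @wrappedB) = 0
    SELECT 'Matching Xml!';
ELSE
    SELECT 'Different Xml.';
```

Keep in mind that the return convention is inverted relative to a typical "is equal" function: 0 means the documents match.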

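Answer 1: (score: 1)

I stumbled upon this fairly comprehensive article, which goes into detail on actually comparing the contents of two XML values to determine whether they are the same. That makes sense, because the attribute order within nodes can differ even when their values are exactly the same. I'd recommend reading through it and perhaps implementing the function to see whether it works for you. I tried it out quickly and it seemed to work for me.

Answer 2: (score: 1)

There are many different ways of comparing two XML documents, and a lot depends on what kinds of differences you want to tolerate: you definitely need to tolerate differences in encoding, attribute order, insignificant whitespace, numeric character references, and use of attribute delimiters, and you should probably also tolerate differences in use of comments, namespace prefixes, and CDATA. So comparing two XML documents as strings is definitely not a good idea, unless you invoke XML canonicalization first.

For many purposes the XQuery deep-equal() function does the right thing (and is more or less equivalent to comparing the canonical forms of the two XML documents). I don't know enough about Microsoft's SQL Server implementation of XQuery to tell you how to invoke this from the SQL level.

Answer 3: (score: 0)

You can convert the fields to varbinary(max), hash them, and compare the hashes. However, you will definitely miss cases where the XML is equivalent but not identical.

To calculate the hash, you can use a CLR function: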

using System;
using System.Data.SqlTypes;
using System.IO;

namespace ClrHelpers
{
    public partial class UserDefinedFunctions {
        [Microsoft.SqlServer.Server.SqlFunction]
        public static Guid HashMD5(SqlBytes data) {
            System.Security.Cryptography.MD5CryptoServiceProvider md5 = new System.Security.Cryptography.MD5CryptoServiceProvider();
            md5.Initialize();
            int len = 0;
            byte[] b = new byte[8192];
            Stream s = data.Stream;
            do {
                len = s.Read(b, 0, 8192);
                md5.TransformBlock(b, 0, len, b, 0);
            } while(len > 0);
            md5.TransformFinalBlock(b, 0, 0);
            Guid g = new Guid(md5.Hash);
            return g;
        }
    };
}

Or a SQL function:

CREATE FUNCTION dbo.GetMyLongHash(@data VARBINARY(MAX))
RETURNS VARBINARY(MAX)
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
    DECLARE @res VARBINARY(MAX) = 0x
    DECLARE @position INT = 1, @len INT = DATALENGTH(@data)

    WHILE 1 = 1
    BEGIN
        SET @res = @res + HASHBYTES('MD5', SUBSTRING(@data, @position, 8000))
        SET @position = @position+8000
        IF @Position > @len 
          BREAK
    END
    WHILE DATALENGTH(@res) > 16 SET @res= dbo.GetMyLongHash(@res)
    RETURN @res
END
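A possible way to use the SQL function for the comparison is sketched below. It assumes dbo.GetMyLongHash has been created as shown above, and the sample XML values are made up for illustration. Matching hashes only indicate that the binary forms are very likely identical, and, as noted, equivalent but differently written XML will not match.

```sql
DECLARE @XmlA XML = N'<root><item id="1">a</item></root>';
DECLARE @XmlB XML = N'<root><item id="1">a</item></root>';

-- Convert each XML value to its binary form, then hash it with the
-- dbo.GetMyLongHash function defined above.
DECLARE @HashA VARBINARY(MAX) = dbo.GetMyLongHash(CONVERT(VARBINARY(MAX), @XmlA));
DECLARE @HashB VARBINARY(MAX) = dbo.GetMyLongHash(CONVERT(VARBINARY(MAX), @XmlB));

IF @HashA = @HashB
    SELECT 'Matching Xml!';
ELSE
    SELECT 'Different Xml.';
```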

Answer 4: (score: 0)

If you can use SQL CLR, I suggest writing a function that uses the XNode.DeepEquals method:

var xmlTree1 = new XElement("Root",
    new XAttribute("Att1", 1),
    new XAttribute("Att2", 2),
    new XElement("Child1", 1),
    new XElement("Child2", "some content")
);
var xmlTree2 = new XElement("Root",
    new XAttribute("Att1", 1),
    new XAttribute("Att2", 2),
    new XElement("Child1", 1),
    new XElement("Child2", "some content")
);
Console.WriteLine(XNode.DeepEquals(xmlTree1, xmlTree2));

If you can't, you can write your own function (see the SQL Fiddle example):

CREATE function [dbo].[udf_XML_Is_Equal]
(
    @Data1 xml,
    @Data2 xml
)
returns bit
as
begin
    declare
        @i bigint, @cnt1 bigint, @cnt2 bigint,
        @Sub_Data1 xml, @Sub_Data2 xml,
        @Name varchar(max), @Value1 nvarchar(max), @Value2 nvarchar(max)

    -- A NULL argument is treated as "not equal"
    if @Data1 is null or @Data2 is null
        return 0

    --=========================================================================================================
    -- If more than one root - recurse for each element
    --=========================================================================================================
    select
        @cnt1 = @Data1.query('count(/*)').value('.','int'),
        @cnt2 = @Data2.query('count(/*)').value('.','int')

    if @cnt1 <> @cnt2
        return 0        

    if @cnt1 > 1
    begin
        select @i = 1
        while @i <= @cnt1
        begin
            select
                @Sub_Data1 = @Data1.query('/*[sql:variable("@i")]'),
                @Sub_Data2 = @Data2.query('/*[sql:variable("@i")]')

            if dbo.udf_XML_Is_Equal(@Sub_Data1, @Sub_Data2) = 0
                return 0

            select @i = @i + 1
        end

        return 1
    end

    --=========================================================================================================
    -- Comparing root data
    --=========================================================================================================
    if @Data1.value('local-name(/*[1])','nvarchar(max)') <> @Data2.value('local-name(/*[1])','nvarchar(max)') 
        return 0

    if @Data1.value('/*[1]', 'nvarchar(max)') <> @Data2.value('/*[1]', 'nvarchar(max)')
        return 0

    --=========================================================================================================
    -- Comparing attributes
    --=========================================================================================================
    select
        @cnt1 = @Data1.query('count(/*[1]/@*)').value('.','int'),
        @cnt2 = @Data2.query('count(/*[1]/@*)').value('.','int')

    if @cnt1 <> @cnt2
        return 0

    if exists (
        select *
        from
        (
            select
                T.C.value('local-name(.)', 'nvarchar(max)') as Name,
                T.C.value('.', 'nvarchar(max)') as Value
            from @Data1.nodes('/*[1]/@*') as T(C)
        ) as D1
        full outer join
        (
            select
                T.C.value('local-name(.)', 'nvarchar(max)') as Name,
                T.C.value('.', 'nvarchar(max)') as Value
            from @Data2.nodes('/*[1]/@*') as T(C)
        ) as D2
        on D1.Name = D2.Name
        where
            not
            (
                D1.Value is null and D2.Value is null or
                D1.Value is not null and D2.Value is not null and D1.Value = D2.Value
            )
    )
        return 0


    --=========================================================================================================
    -- Recursively running for each child
    --=========================================================================================================
    select
        @cnt1 = @Data1.query('count(/*[1]/*)').value('.','int'),
        @cnt2 = @Data2.query('count(/*[1]/*)').value('.','int')

    if @cnt1 <> @cnt2
        return 0    

    select @i = 1
    while @i <= @cnt1        
    begin
        select
            @Sub_Data1 = @Data1.query('/*/*[sql:variable("@i")]'),
            @Sub_Data2 = @Data2.query('/*/*[sql:variable("@i")]')

        if dbo.udf_XML_Is_Equal(@Sub_Data1, @Sub_Data2) = 0
            return 0

        select @i = @i + 1
    end

    return 1
END
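A short usage sketch for this function follows, assuming dbo.udf_XML_Is_Equal has been created as above. The sample documents are made up for illustration and differ only in the order of their attributes, which the full outer join on attribute names is meant to tolerate.

```sql
-- Same content, attributes written in a different order.
DECLARE @XmlA XML = N'<Root Att1="1" Att2="2"><Child1>1</Child1></Root>';
DECLARE @XmlB XML = N'<Root Att2="2" Att1="1"><Child1>1</Child1></Root>';

-- udf_XML_Is_Equal returns 1 when the documents are considered equal
-- and 0 otherwise.
IF dbo.udf_XML_Is_Equal(@XmlA, @XmlB) = 1
    SELECT 'Matching Xml!';
ELSE
    SELECT 'Different Xml.';
```

Unlike the CompareXml function in the earlier answer, this function compares child elements by position, so the same children in a different order would be reported as different.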