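Comparing two documents in which parent and child elements are ordered differently

Time: 2014-02-07 18:07:55

Tags: java xml parsing comparison xmlunit

I am trying to unit test some methods that generate XML. I have an expected XML string and a result string, and after googling and searching Stack Overflow I found XMLUnit. However, it does not seem to handle one particular case: repeated elements that appear in a different order and whose child elements are also in a different order. For example:

Expected XML: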

<graph>
  <parent>
    <foo>David</foo>
    <bar>Rosalyn</bar>
  </parent>
  <parent>
    <bar>Alexander</bar>
    <foo>Linda</foo>
  </parent>
</graph>

Actual XML:

<graph>
  <parent>
    <foo>Linda</foo>
    <bar>Alexander</bar>
  </parent>
  <parent>
    <bar>Rosalyn</bar>
    <foo>David</foo>
  </parent>
</graph>

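As you can see, the parent node is repeated and its contents can be in any order. These two XML fragments should be equivalent, but none of the Stack Overflow examples I have seen handle this case (Best way to compare 2 XML documents in Java; How can I compare two similar XML files in XMLUnit).

I have tried creating Documents from the XML strings, stepping through each expected parent node, and comparing it with each actual parent node to see whether one of them is equivalent.

To me this feels like reinventing the wheel for what should be a fairly common comparison. XMLUnit seems to do a lot, and maybe I have missed something, but from what I can tell it falls short in this particular case.

Is there an easier or better way?

My solution: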

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setCoalescing(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
DocumentBuilder db = dbf.newDocumentBuilder();
// parse and normalize expected xml
Document expectedXMLDoc = db.parse(new ByteArrayInputStream(resultXML.getBytes()));
expectedXMLDoc.normalizeDocument();
// parse and normalize actual xml
Document actualXMLDoc = db.parse(new ByteArrayInputStream(actual.getXml().getBytes()));
actualXMLDoc.normalizeDocument();
// expected and actual parent nodes
NodeList expectedParentNodes = expectedXMLDoc.getLastChild().getChildNodes();
NodeList actualParentNodes = actualXMLDoc.getLastChild().getChildNodes();

// assert same amount of nodes in actual and expected
assertEquals("actual XML does not have expected amount of Parent nodes", expectedParentNodes.getLength(), actualParentNodes.getLength());

// loop through expected parent nodes
for(int i=0; i < expectedParentNodes.getLength(); i++) {
    // create doc from node
    Node expectedParentNode = expectedParentNodes.item(i);    
    Document expectedParentDoc = db.newDocument();
    Node importedExpectedNode = expectedParentDoc.importNode(expectedParentNode, true);
    expectedParentDoc.appendChild(importedExpectedNode);

    boolean hasSimilar = false;
    StringBuilder  messages = new StringBuilder();

    // for each expected parent, find a similar parent
    for(int j=0; j < actualParentNodes.getLength(); j++) {
        // create doc from node
        Node actualParentNode = actualParentNodes.item(j);
        Document actualParentDoc = db.newDocument();
        Node importedActualNode = actualParentDoc.importNode(actualParentNode, true);
        actualParentDoc.appendChild(importedActualNode);

        // XMLUnit Diff
        Diff diff = new Diff(expectedParentDoc, actualParentDoc);
        messages.append(diff.toString());
        boolean similar = diff.similar();
        if(similar) {
            hasSimilar = true;
        }
    }
    // assert it found a similar parent node
    assertTrue("expected and actual XML nodes are not equivalent " + messages, hasSimilar);        
}    

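3 answers:

Answer 0: (score: 1)

Use an XSL identity transform with an added <xsl:sort.../> to reorder the nodes in each document by name, then compare the sorted output. For some nodes (namely the top-level parent nodes) you will probably need a specific sort key based on their inner content.

Here is a skeleton to get you started: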

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- Identity Transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <!-- Attributes first: they must be written before any child node -->
            <xsl:apply-templates select="@*"/>
            <!-- Then child nodes, sorted by name -->
            <xsl:apply-templates select="node()">
                <xsl:sort select="name(.)"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

    <!-- Special handling for graph/parent nodes -->
    <xsl:template match="graph">
        <!-- Copy the graph element itself so the output keeps its root -->
        <xsl:copy>
            <!-- Copy attributes using the default template above -->
            <xsl:apply-templates select="@*"/>
            <!-- Sort parent nodes by text of bar node -->
            <xsl:apply-templates select="parent">
                <xsl:sort select="bar/text()"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

This works for the sample you posted. Adjust as needed for your actual data.
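
To compare the sorted output from Java, a stylesheet of this kind can be applied with the JDK's built-in javax.xml.transform API. The following is only a minimal sketch: the class name SortedXmlComparison and its methods are made up for illustration, the stylesheet is inlined as a string, and the sort key bar is specific to the sample in the question. It also adds xsl:strip-space so that indentation differences do not affect the result, which is an assumption about what you want.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: sorts two XML strings with the same stylesheet and compares the results.
class SortedXmlComparison {

    // Sorting stylesheet. The key "bar" used for parent nodes is specific to the sample data.
    static final String SORT_XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:strip-space elements='*'/>"
        + "<xsl:template match='@*|node()'>"
        +   "<xsl:copy>"
        +     "<xsl:apply-templates select='@*'><xsl:sort select='name()'/></xsl:apply-templates>"
        +     "<xsl:apply-templates select='node()'><xsl:sort select='name()'/></xsl:apply-templates>"
        +   "</xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match='graph'>"
        +   "<xsl:copy>"
        +     "<xsl:apply-templates select='@*'/>"
        +     "<xsl:apply-templates select='parent'><xsl:sort select='bar'/></xsl:apply-templates>"
        +   "</xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the sorting stylesheet over one XML string and returns the serialized result.
    static String sort(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(SORT_XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not sort XML", e);
        }
    }

    // Two documents count as equivalent when their sorted serializations are identical.
    static boolean equivalent(String expectedXml, String actualXml) {
        return sort(expectedXml).equals(sort(actualXml));
    }
}
```

With the two sample documents from the question, SortedXmlComparison.equivalent(expected, actual) should return true. A plain string comparison like this is strict about everything the sort does not normalize, so treat it as a starting point rather than a finished assertion.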

Answer 1: (score: 0)

You can use a recursive function, so that it works for any XML structure in which element order does not matter. Here is a sketch:

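import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class UnorderedXmlComparison {

    // Two nodes are equal when their type, name and value match and their
    // children can be paired one-to-one regardless of order.
    // Attributes are not compared here.
    public static boolean isEqual(Node node1, Node node2) {
        if (node1.getNodeType() != node2.getNodeType()) {
            return false;
        }
        if (!node1.getNodeName().equals(node2.getNodeName())
                || !Objects.equals(node1.getNodeValue(), node2.getNodeValue())) {
            return false;
        }
        List<Node> children1 = childrenOf(node1);
        List<Node> children2 = childrenOf(node2);
        if (children1.size() != children2.size()) {
            return false;
        }
        // Pair each child of node1 with a not-yet-matched child of node2.
        for (Node child1 : children1) {
            boolean matchFound = false;
            for (int i = 0; i < children2.size(); i++) {
                if (isEqual(child1, children2.get(i))) {
                    children2.remove(i); // each node may be matched only once
                    matchFound = true;
                    break;
                }
            }
            if (!matchFound) {
                return false;
            }
        }
        return true;
    }

    // Copies the child nodes into a mutable list, skipping whitespace-only text nodes.
    private static List<Node> childrenOf(Node node) {
        List<Node> result = new ArrayList<>();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            boolean blankText = child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty();
            if (!blankText) {
                result.add(child);
            }
        }
        return result;
    }
}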

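Answer 2: (score: 0)

I just realized I never selected an answer for this. I ended up using something very similar to my own solution. Here is the final solution that worked for me. I wrapped it in a class for use with JUnit, so the methods can be used like any other JUnit assertion.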

If none of the children need to be in a particular order, as in my case, you can run

assertEquivalentXml(expectedXML, testXML, null, null);

If the children of some nodes must be in order and/or the values of some attributes should be ignored:

assertEquivalentXml(expectedXML, testXML,
                new String[]{"dataset", "categories"}, new String[]{"color", "anchorBorderColor", "anchorBgColor"});

Here is the class:

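import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import junit.framework.Assert;
import org.custommonkey.xmlunit.Diff;
import org.custommonkey.xmlunit.Difference;
import org.custommonkey.xmlunit.DifferenceConstants;
import org.custommonkey.xmlunit.DifferenceListener;
import org.custommonkey.xmlunit.ElementNameAndAttributeQualifier;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/**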
 * A set of methods that assert XML equivalence specifically for XmlProvider classes. Extends 
 * <code>junit.framework.Assert</code>, meaning that these methods are recognised as assertions by junit.
 *
 * @author munick
 */
public class XmlProviderAssertions extends Assert {    

    /**
     * Asserts two xml strings are equivalent. Nodes are not expected to be in order. Order can be compared among the 
     * children of the top parent node by adding their names to nodesWithOrderedChildren 
     * (e.g. in <graph><dataset><set value="1"/><set value="2"/></dataset></graph> the top parent node is graph 
     * and we can expect the children of dataset to be in order by adding "dataset" to nodesWithOrderedChildren).
     * 
     * All attribute names and values are compared unless their name is in attributesToIgnore in which case only the 
     * name is compared and any difference in value is ignored.
     * 
     * @param expectedXML the expected xml string 
     * @param testXML the xml string being tested
     * @param nodesWithOrderedChildren names of nodes whose children should be in order
     * @param attributesToIgnore names of attributes whose values should be ignored
     */
    public static void assertEquivalentXml(String expectedXML, String testXML, String[] nodesWithOrderedChildren, String[] attributesToIgnore) {
        Set<String> setOfNodesWithOrderedChildren = new HashSet<String>();
        if(nodesWithOrderedChildren != null ) {
            Collections.addAll(setOfNodesWithOrderedChildren, nodesWithOrderedChildren);
        }

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setCoalescing(true);
        dbf.setIgnoringElementContentWhitespace(true);
        dbf.setIgnoringComments(true);
        DocumentBuilder db = null;
        try {
            db = dbf.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            fail("Error testing XML");
        }

        Document expectedXMLDoc = null;
        Document testXMLDoc = null;
        try {
            expectedXMLDoc = db.parse(new ByteArrayInputStream(expectedXML.getBytes()));
            expectedXMLDoc.normalizeDocument();

            testXMLDoc = db.parse(new ByteArrayInputStream(testXML.getBytes()));
            testXMLDoc.normalizeDocument();
        } catch (SAXException e) {
            fail("Could not parse testXML");
        } catch (IOException e) {
            fail("Could not read testXML");
        }
        NodeList expectedChildNodes = expectedXMLDoc.getLastChild().getChildNodes();
        NodeList testChildNodes = testXMLDoc.getLastChild().getChildNodes();

        assertEquals("Test XML does not have expected amount of child nodes", expectedChildNodes.getLength(), testChildNodes.getLength());

        //compare parent nodes        
        Document expectedDEDoc = getNodeAsDocument(expectedXMLDoc.getDocumentElement(), db, false);        
        Document testDEDoc = getNodeAsDocument(testXMLDoc.getDocumentElement(), db, false);
        Diff diff = new Diff(expectedDEDoc, testDEDoc);
        assertTrue("Test XML parent node doesn't match expected XML parent node. " + diff.toString(), diff.similar());

        // compare child nodes
        for(int i=0; i < expectedChildNodes.getLength(); i++) {
            // expected child node
            Node expectedChildNode = expectedChildNodes.item(i);
            // skip text nodes
            if( expectedChildNode.getNodeType() == Node.TEXT_NODE ) {
                continue;
            }
            // convert to document to use in Diff
            Document expectedChildDoc = getNodeAsDocument(expectedChildNode, db, true);

            boolean hasSimilar = false;
            StringBuilder  messages = new StringBuilder();

            for(int j=0; j < testChildNodes.getLength(); j++) {
                // find child node in test xml
                Node testChildNode = testChildNodes.item(j);
                // skip text nodes
                if( testChildNode.getNodeType() == Node.TEXT_NODE ) {
                    continue;
                }
                // create doc from node
                Document testChildDoc = getNodeAsDocument(testChildNode, db, true);

                diff = new Diff(expectedChildDoc, testChildDoc);
                // if it doesn't contain order specific nodes, then use the elem and attribute qualifier, otherwise use the default
                if( !setOfNodesWithOrderedChildren.contains( expectedChildDoc.getDocumentElement().getNodeName() ) ) {
                    diff.overrideElementQualifier(new ElementNameAndAttributeQualifier());
                }
                if(attributesToIgnore != null) {
                    diff.overrideDifferenceListener(new IgnoreNamedAttributesDifferenceListener(attributesToIgnore));
                }
                messages.append(diff.toString());
                boolean similar = diff.similar();
                if(similar) {
                    hasSimilar = true;
                }
            }
            assertTrue("Test XML does not match expected XML. " + messages, hasSimilar);
        }
    }

    private static Document getNodeAsDocument(Node node, DocumentBuilder db, boolean deep) {
        // create doc from node
        Document nodeDoc = db.newDocument();
        Node importedNode = nodeDoc.importNode(node, deep);
        nodeDoc.appendChild(importedNode);
        return nodeDoc;
    }

}

/**
 * Custom difference listener that ignores differences in attribute values for specified attribute names. Used to 
 * ignore color attribute differences in FusionChartXml equivalence.
 */
class IgnoreNamedAttributesDifferenceListener implements DifferenceListener {
    Set<String> attributeBlackList;

    public IgnoreNamedAttributesDifferenceListener(String[] attributeNames) {        
        attributeBlackList = new HashSet<String>();
        Collections.addAll(attributeBlackList, attributeNames);
    }

    public int differenceFound(Difference difference) {
        int differenceId = difference.getId();
        if (differenceId == DifferenceConstants.ATTR_VALUE_ID) {
            if(attributeBlackList.contains(difference.getControlNodeDetail().getNode().getNodeName())) {
                return DifferenceListener.RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL;
            }
        }

        return DifferenceListener.RETURN_ACCEPT_DIFFERENCE;
    }

    public void skippedComparison(Node node, Node node1) {
        // left empty
    }
}
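
The attribute rule from the Javadoc of assertEquivalentXml (attribute names are always compared, values are skipped for names listed in attributesToIgnore) can also be illustrated without XMLUnit. This is only a rough sketch using the JDK's DOM API, not what XMLUnit does internally; the class AttributeRuleSketch and its methods are hypothetical names introduced for the example.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical, stdlib-only illustration of "compare names, optionally ignore values".
class AttributeRuleSketch {

    // Returns true when both elements carry the same attribute names and, for every
    // name not in ignoredValues, the same attribute value. Attribute order is irrelevant.
    static boolean attributesMatch(Element expected, Element actual, Set<String> ignoredValues) {
        NamedNodeMap expectedAttrs = expected.getAttributes();
        NamedNodeMap actualAttrs = actual.getAttributes();
        if (expectedAttrs.getLength() != actualAttrs.getLength()) {
            return false;
        }
        for (int i = 0; i < expectedAttrs.getLength(); i++) {
            Node attr = expectedAttrs.item(i);
            Node other = actualAttrs.getNamedItem(attr.getNodeName());
            if (other == null) {
                return false; // the name itself is missing, which is never ignored
            }
            boolean compareValue = !ignoredValues.contains(attr.getNodeName());
            if (compareValue && !attr.getNodeValue().equals(other.getNodeValue())) {
                return false;
            }
        }
        return true;
    }

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

For example, <set value="1" color="red"/> and <set color="blue" value="1"/> match when "color" is in the ignore set, and stop matching once the set is empty or the value attribute differs.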