How do I remove elements/attributes from a sequence before applying deep-equal?

Asked: 2013-09-16 14:15:45

Tags: xml xslt

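This question is an extension of the question here. The answer provided by @Martin Honnen here does almost exactly what I want, but I wrongly assumed that by the time deep-equal is called, the elements/attributes removed by the earlier templates would already be gone, and so would not be present in the sequences passed to deep-equal.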

How can I remove elements/attributes from the sequences passed to deep-equal, or tell deep-equal to ignore certain elements/attributes?

XSL:

<!--
    When a file is transformed using this stylesheet the output will be
    formatted as follows:

    1.)  Elements named "info" will be removed
    2.)  Attributes named "file_line_nr" or "file_name" will be removed
    3.)  Comments will be removed
    4.)  Processing instructions will be removed
    5.)  XML declaration will be removed
    6.)  Extra whitespace will be removed
    7.)  Empty attributes will be removed
    8.)  Elements which have no attributes, child elements, or text will be removed
    9.)  Duplicate sibling elements will be removed
    10.) All elements will be sorted by name recursively
    11.) All attributes will be sorted by name
-->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Set output options -->
    <xsl:output indent="yes" method="xml" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Match any attribute -->
    <xsl:template match="@*">
        <xsl:copy/>
    </xsl:template>

    <!-- Match any element -->
    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates select="@*">
                <xsl:sort select="local-name()"/>
            </xsl:apply-templates>
            <xsl:for-each-group select="node() except (processing-instruction(), comment())" group-adjacent="boolean(self::*)">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <xsl:apply-templates select="current-group()">
                            <xsl:sort select="local-name()"/>
                        </xsl:apply-templates>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:apply-templates select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

    <!-- Elements/attributes to ignore -->
    <xsl:template match="@*[normalize-space()='']|info|@file_line_nr|@file_name|*[not(@*|node())]"/>

    <!-- Ignore element nodes which are deep-equal to a preceding sibling element -->
    <xsl:template match="*[some $ps in preceding-sibling::* satisfies deep-equal(., $ps)]"/>

</xsl:stylesheet>

XML input:

<root>
    <!-- foo #1 -->
    <foo a="a" file_line_nr="1"/>

    <!-- bar #1 -->
    <bar>
    some text
        <info a="a"/>
    </bar>

    <!-- foo #2 -->
    <foo a="a" file_line_nr="2"/><!-- This should be removed because it is identical to the foo #1 except for the "file_line_nr" attribute which should be removed/ignored -->

    <!-- baz #1 -->
    <baz file_name="some_file.h">
        <bam a="a"/>
    </baz>

    <!-- bar #2 -->
    <bar><!-- This should be removed because it is identical to the bar #1 except for the "info" child element which should be removed/ignored -->
        some text
        <info b="b"/>
    </bar>

    <!-- baz #2 -->
    <baz file_name="some_other_file.h"><!-- This should be removed because it is identical to the baz #1 except for the "file_name" attribute which should be removed/ignored -->
        <bam a="a"/>
    </baz>
</root>

Expected output:

<root>
    <foo a="a"/>
    <bar>
    some text
    </bar>
    <baz>
        <bam a="a"/>
    </baz>
</root>

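1 answer:

Answer 0 (score: 2)

Your templates all match nodes in the original XML tree, not the nodes you output. You have to split the processing into two phases: first strip all the unwanted elements, attributes, etc. and capture the result in a variable, then apply the deep-equal filtering to that variable. Something like this: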

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Set output options -->
    <xsl:output indent="yes" method="xml" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
      <!-- first pass with default-mode templates -->
      <xsl:variable name="pass1">
        <xsl:apply-templates />
      </xsl:variable>
      <!-- second pass with dedup templates -->
      <xsl:apply-templates select="$pass1" mode="dedup" />
    </xsl:template>

    <!-- pre-processing templates -->

    <!-- Match any attribute -->
    <xsl:template match="@*">
        <xsl:copy/>
    </xsl:template>

    <!-- Match any element -->
    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates select="@*">
                <xsl:sort select="local-name()"/>
            </xsl:apply-templates>
            <xsl:for-each-group select="node() except (processing-instruction(), comment())" group-adjacent="boolean(self::*)">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <xsl:apply-templates select="current-group()">
                            <xsl:sort select="local-name()"/>
                        </xsl:apply-templates>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:apply-templates select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

    <!-- Elements/attributes to ignore -->
    <xsl:template match="@*[normalize-space()='']|info|@file_line_nr|@file_name|*[not(@*|node())]"/>


    <!-- de-duplication templates -->

    <!-- Ignore element nodes which are deep-equal to a preceding sibling element -->
    <xsl:template mode="dedup" match="*[some $ps in preceding-sibling::* satisfies deep-equal(., $ps)]"/>

    <xsl:template mode="dedup" match="@*|node()">
      <xsl:copy><xsl:apply-templates mode="dedup" select="@*|node()"/></xsl:copy>
    </xsl:template>

</xsl:stylesheet>
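
One caveat, offered as an observation on the sample input as shown rather than as part of the answer above: xsl:strip-space only removes whitespace-only text nodes, and deep-equal compares text content exactly. The "some text" nodes in the two bar elements differ in their leading indentation, so they may still not compare as equal after the first pass. If that applies to your data, a minimal sketch of an extra first-pass template could look like the following. It is an assumption that collapsing whitespace is acceptable for your content; in mixed content it would also remove spaces next to inline elements.

```xml
<!-- Hypothetical addition to the pre-processing (default-mode) templates.
     Trims and collapses whitespace in every text node, so that text which
     differs only in indentation is identical by the time the dedup pass
     calls deep-equal on the $pass1 tree. -->
<xsl:template match="text()">
    <xsl:value-of select="normalize-space(.)"/>
</xsl:template>
```

Because this template sits in the default mode, it only affects the tree stored in $pass1; the dedup-mode templates then copy the already normalized text unchanged.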