如何在XSL中防止重复?

时间:2010-04-19 18:20:16

标签: xslt duplicates

如何防止重复条目进入列表,然后理想情况下对该列表进行排序?我正在做的是,当缺少一个级别的信息时,从下面的级别获取信息,在上面的级别中构建缺失列表。目前,我有类似的XML:

<c03 id="ref6488" level="file">
    <did>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
    </did>
    <c04 id="ref34582" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container>
        </did>
    </c04>
    <c04 id="ref6540" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <unittitle>Contact prints</unittitle>
        </did>
    </c04>
    <c04 id="ref6606" level="file">
        <did>
            <container label="Box" type="Box">154</container>
            <unittitle>Negatives</unittitle>
        </did>
    </c04>
</c03>

然后我应用以下XSL:

<xsl:template match="c03/did">
    <xsl:choose>
        <xsl:when test="not(container)">
            <did>
                <!-- If no c03 container item is found, look in the c04 level for one -->
                <xsl:if test="../c04/did/container">

                    <!-- If a c04 container item is found, use the info to build a c03 version -->
                    <!-- Skip c03 container item, if still no c04 items found -->
                    <container label="Box" type="Box">

                        <!-- Build container list -->
                        <!-- Test for more than one item, and if so, list them, -->
                        <!-- separated by commas and a space -->
                        <xsl:for-each select="../c04/did">
                            <xsl:if test="position() &gt; 1">, </xsl:if>
                            <xsl:value-of select="container"/>
                        </xsl:for-each>
                    </container>                    
            </did>
        </xsl:when>

        <!-- If there is a c03 container item(s), list it normally -->
        <xsl:otherwise>
            <xsl:copy-of select="."/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

但是我得到了

的“容器”结果
<container label="Box" type="Box">156, 156, 154</container>

当我想要的是

<container label="Box" type="Box">154, 156</container>

以下是我想要获得的完整结果:

<c03 id="ref6488" level="file">
    <did>
        <container label="Box" type="Box">154, 156</container>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
    </did>
    <c04 id="ref34582" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container>
        </did>
    </c04>
    <c04 id="ref6540" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <unittitle>Contact prints</unittitle>
        </did>
    </c04>
    <c04 id="ref6606" level="file">
        <did>
            <container label="Box" type="Box">154</container>
            <unittitle>Negatives</unittitle>
        </did>
    </c04>
</c03>

提前感谢您的帮助!

6 个答案:

答案 0 :(得分:2)

此问题无需XSLT 2.0解决方案

这是一个XSLT 1.0解决方案,比当前选择的XSLT 2.0解决方案更紧凑(35行与43行):

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:key name="kBoxContainerByVal"
     match="container[@type='Box']" use="."/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="c03/did[not(container)]">
   <xsl:copy>

   <xsl:variable name="vContDistinctValues" select=
    "/*/*/*/container[@type='Box']
            [generate-id()
            =
             generate-id(key('kBoxContainerByVal', .)[1])
            ]
            "/>

    <container label="Box" type="Box">
      <xsl:for-each select="$vContDistinctValues">
        <xsl:sort data-type="number"/>

        <xsl:value-of select=
        "concat(., substring(', ', 1 + 2*(position() = last())))"/>
      </xsl:for-each>
    </container>
    <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

在最初提供的XML文档上应用此转换时,会生成正确的所需结果

<c03 id="ref6488" level="file">
   <did>
      <container label="Box" type="Box">156, 154</container>
      <unittitle>Clinic Building</unittitle>
      <unitdate era="ce" calendar="gregorian">1947</unitdate>
   </did>
   <c04 id="ref34582" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <container label="Folder" type="Folder">3</container>
      </did>
   </c04>
   <c04 id="ref6540" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <unittitle>Contact prints</unittitle>
      </did>
   </c04>
   <c04 id="ref6606" level="file">
      <did>
         <container label="Box" type="Box">154</container>
         <unittitle>Negatives</unittitle>
      </did>
   </c04>
</c03>

<强>更新

我没有注意到容器编号必须出现排序的要求。现在解决方案反映了这一点。

答案 1 :(得分:1)

尝试在xslt中使用Key组,这是一篇关于Muenchian方法的文章,它应该有助于消除重复。 http://www.jenitennison.com/xslt/grouping/muenchian.html

答案 2 :(得分:1)

请尝试以下代码:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"></xsl:output>

<xsl:template match="node() | @*">
  <xsl:copy>
    <xsl:apply-templates select="node() | @*"/>
  </xsl:copy>
</xsl:template>

  <xsl:template match="c03/did">
    <xsl:choose>
      <xsl:when test="not(container)">
        <did>
          <!-- If no c03 container item is found, look in the c04 level for one -->
          <xsl:if test="../c04/did/container">
            <xsl:variable name="foo" select="../c04/did/container[@type='Box']/text()"/>
            <!-- If a c04 container item is found, use the info to build a c03 version -->
            <!-- Skip c03 container item, if still no c04 items found -->
            <container label="Box" type="Box">

              <!-- Build container list -->
              <!-- Test for more than one item, and if so, list them, -->
              <!-- separated by commas and a space -->
              <xsl:for-each select="distinct-values($foo)">
                <xsl:sort />
                <xsl:if test="position() &gt; 1">, </xsl:if>
                <xsl:value-of select="." />
              </xsl:for-each>
            </container>
            <xsl:apply-templates select="*" />
          </xsl:if>
        </did>
      </xsl:when>

      <!-- If there is a c03 container item(s), list it normally -->
      <xsl:otherwise>
        <xsl:copy-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

它看起来像你想要的输出:

<?xml version="1.0" encoding="UTF-8"?>
<c03 id="ref6488" level="file">
  <did>
      <container label="Box" type="Box">154, 156</container>
      <unittitle>Clinic Building</unittitle>
      <unitdate era="ce" calendar="gregorian">1947</unitdate>
   </did>
  <c04 id="ref34582" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <container label="Folder" type="Folder">3</container>
      </did>
  </c04>
  <c04 id="ref6540" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <unittitle>Contact prints</unittitle>
      </did>
  </c04>
  <c04 id="ref6606" level="file">
      <did>
         <container label="Box" type="Box">154</container>
         <unittitle>Negatives</unittitle>
      </did>
  </c04>
</c03>

诀窍是一起使用<xsl:sort>distinct-values()。请参阅Michael Key“XSLT 2.0和XPATH 2.0”中的(IMHO)精彩书籍

答案 3 :(得分:1)

稍微缩短的XSLT 2.0版本,结合其他答案的方法。请注意,排序是按字母顺序排列的,因此如果找到标签“54”和“156”,则输出将为“156,54”。如果需要进行数字排序,请使用<xsl:sort select="number(.)"/>代替<xsl:sort/>

<xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/> 
    <xsl:strip-space elements="*"/>

    <xsl:template match="node()|@*"> 
        <xsl:copy> 
            <xsl:apply-templates select="node()|@*"/> 
        </xsl:copy> 
    </xsl:template> 

    <xsl:template match="c03/did[not(container)]">
        <xsl:variable name="containers" 
                      select="../c04/did/container[@label='Box'][text()]"/>
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:if test="$containers">
                <container label="Box" type="Box">
                    <xsl:for-each select="distinct-values($containers)">
                        <xsl:sort/>
                        <xsl:if test="position() != 1">, </xsl:if>
                        <xsl:value-of select="."/>
                    </xsl:for-each>
                </container> 
            </xsl:if>
            <xsl:apply-templates select="node()"/> 
        </xsl:copy> 
    </xsl:template> 
</xsl:stylesheet>

答案 4 :(得分:1)

真正的XSLT 2.0解决方案,也很短

<xsl:stylesheet  version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
>
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="c03/did[not(container)]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>

      <xsl:variable name="vContDistinctValues" as="xs:integer*">
        <xsl:perform-sort select=
          "distinct-values(/*/*/*/container[@type='Box']/text()/xs:integer(.))">
          <xsl:sort/>
        </xsl:perform-sort>
      </xsl:variable>

      <xsl:if test="$vContDistinctValues">
        <container label="Box" type="Box">
          <xsl:value-of select="$vContDistinctValues" separator=","/>
        </container>
      </xsl:if>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

请注意:

  1. 使用类型可以避免在data-type中指定<xsl:sort/>

  2. 使用separator

  3. <xsl:value-of/>属性

答案 5 :(得分:0)

以下XSLT 1.0转换可以满足您的需求

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> 
  <xsl:output encoding="utf-8" />

  <!-- key to index containers by these three distinct qualities: 
       1: their ancestor <c??> node (represented as its unique ID)
       2: their @type attribute value
       3: their node value (i.e. their text) -->
  <xsl:key 
    name  = "kContainer" 
    match = "container"
    use   = "concat(generate-id(../../..), '|', @type, '|', .)"
  />

  <!-- identity template to copy everything as is by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <!-- special template for <did>s without a <container> child -->
  <xsl:template match="did[not(container)]">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <container label="Box" type="Box">
        <!-- from subordinate <container>s of type Box, use the ones
             that are *the first* to have that certain combination 
             of the three distinct qualities mentioned above -->
        <xsl:apply-templates mode="list-values" select="
          ../*/did/container[@type='Box'][
            generate-id()
            =
            generate-id(
              key(
                'kContainer', 
                concat(generate-id(../../..), '|', @type, '|', .)
              )[1]
            )
          ]
        ">
          <!-- sort them by their node value -->
          <xsl:sort select="." data-type="number" />
        </xsl:apply-templates>
      </container>
      <xsl:apply-templates select="node()" />
    </xsl:copy>
  </xsl:template>

  <!-- generic template to make list of values from any node-set -->
  <xsl:template match="*" mode="list-values">
    <xsl:value-of select="." />
    <xsl:if test="position() &lt; last()">
      <xsl:text>, </xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

返回

<c03 id="ref6488" level="file">
  <did>
    <container label="Box" type="Box">154, 156</container>
    <unittitle>Clinic Building</unittitle>
    <unitdate era="ce" calendar="gregorian">1947</unitdate>
  </did>
  <c04 id="ref34582" level="file">
    <did>
      <container label="Box" type="Box">156</container>
      <container label="Folder" type="Folder">3</container>
    </did>
  </c04>
  <c04 id="ref6540" level="file">
    <did>
      <container label="Box" type="Box">156</container>
      <unittitle>Contact prints</unittitle>
    </did>
  </c04>
  <c04 id="ref6606" level="file">
    <did>
      <container label="Box" type="Box">154</container>
      <unittitle>Negatives</unittitle>
    </did>
  </c04>
</c03>

generate-id() = generate-id(key(...)[1])部分就是所谓的Muenchian分组。除非您可以使用XSLT 2.0,否则就可以了。