Remove duplicate elements using distinct-values and XSLT 2.0

Time: 2015-11-11 23:19:54

Tags: xslt xslt-2.0

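I am trying to solve a problem where I want to remove duplicate values from a sequence of elements.

I have been playing with this for a while, and the code below looks like something I thought would work, but I get an error:

XPTY0020: Leading '/' cannot select the root node of the tree containing the context item: the context item is not a node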

XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
    <xsl:strip-space elements="*"/>
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="/">

        <xsl:for-each select="distinct-values(/tobject/tobject.subject/@tobject.subject.refnum)">
            <xsl:copy-of select="."/>
        </xsl:for-each>

    </xsl:template>
</xsl:stylesheet>

XML:

<?xml version="1.0" encoding="UTF-8"?>
<tobject tobject.type="Utenriks">
    <tobject.property tobject.property.type="Nyheter"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/>
    <tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/>
    <tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/>
</tobject>

2 answers:

Answer 0 (score: 1)

  

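"the code below looks like something I thought would work, but I get an error:

XPTY0020: Leading '/' cannot select the root node of the tree containing the context item: the context item is not a node"

I cannot reproduce this error when running your code. See: http://xsltransform.net/gWvjQfa

However, the result of distinct-values() is a sequence of atomic values, not nodes. The result you expect (removing the duplicate elements) is easier to achieve with grouping: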

XSLT 2.0

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/tobject">
    <xsl:copy>
        <xsl:copy-of select="@* | tobject.property"/>
        <xsl:for-each-group select="tobject.subject" group-by="@tobject.subject.refnum">
            <xsl:copy-of select="current-group()[1]"/>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>
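
As a side note on the error itself: XPTY0020 is raised when a path step is evaluated while the context item is not a node. Inside an xsl:for-each over distinct-values(...), the context item is an atomic value, so a path that starts with '/' (or any relative path) fails there. That may be what happened in the real stylesheet, although this is only a guess, since the posted code does not contain such a path. Below is a minimal sketch of how to keep distinct-values() and still copy elements. It assumes a hypothetical variable named vRoot that holds the tobject element before the context changes:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/tobject">
    <!-- vRoot is a hypothetical name: it keeps a node reference
         that stays usable after the context item changes -->
    <xsl:variable name="vRoot" select="."/>
    <xsl:copy>
      <xsl:copy-of select="@* | tobject.property"/>
      <xsl:for-each select="distinct-values(tobject.subject/@tobject.subject.refnum)">
        <!-- Here "." is an atomic value, not a node, so every path
             must start from a variable instead of "/" or "." -->
        <xsl:copy-of select="$vRoot/tobject.subject[@tobject.subject.refnum = current()][1]"/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The order in which distinct-values() returns its items is implementation-dependent, so the grouping stylesheet above remains the safer choice when the output order matters.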

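Answer 1 (score: 0)

I. A shorter solution that is pure XSLT 1.0 and does not use unnecessary element names.

Also, it is no less efficient than the XSLT 2.0 solution that uses <xsl:for-each-group>, because here we use the Muenchian method for grouping: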

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 <xsl:key name="kOS" match="tobject.subject" use="@tobject.subject.refnum"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match=
  "tobject.subject[generate-id() != generate-id(key('kOS', @tobject.subject.refnum)[1])]"/>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<tobject tobject.type="Utenriks">
    <tobject.property tobject.property.type="Nyheter"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/>
    <tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/>
    <tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/>
</tobject>

the wanted, correct result is produced:

<tobject tobject.type="Utenriks">
   <tobject.property tobject.property.type="Nyheter"/>
   <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/>
   <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/>
   <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/>
   <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/>
   <tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/>
   <tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/>
</tobject>
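
For comparison, the same key can drive a plain xsl:for-each instead of an identity transform with an empty template. This is only a sketch of the other common spelling of the Muenchian test, count(. | key(...)[1]) = 1, in place of comparing generate-id() values. It reuses the kOS key from the stylesheet above and hard-codes the tobject and tobject.property names, which the solution above avoids:

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:key name="kOS" match="tobject.subject" use="@tobject.subject.refnum"/>

  <xsl:template match="/tobject">
    <xsl:copy>
      <xsl:copy-of select="@* | tobject.property"/>
      <!-- The union of an element with the first member of its key group
           has exactly one node only when the element IS that first member -->
      <xsl:for-each select="tobject.subject[count(. | key('kOS', @tobject.subject.refnum)[1]) = 1]">
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```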

II. A single-line XPath 2.0 expression that selects the wanted unique elements (one element from each group):

$vElems[index-of($vElems/@tobject.subject.refnum, @tobject.subject.refnum)[1]]

Here $vElems must be defined as:

/*/tobject.subject

When this XPath 2.0 expression is evaluated against the provided XML document, the wanted sequence of elements is selected:

<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000"
             tobject.subject.type="økonomi og næringsliv"/>
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000"
             tobject.subject.matter="olje og energi"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000"
             tobject.subject.type="politikk"/>
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000"
             tobject.subject.matter="valg"/>
<tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000"
             tobject.subject.type="kriminalitet og rettsvesen"/>
<tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000"
             tobject.subject.type="fritid"/>
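
To try the expression inside a stylesheet, $vElems can be bound with xsl:variable. The following is a minimal sketch; the template layout is an illustration and not part of the expression itself. It assumes that every tobject.subject element carries a tobject.subject.refnum attribute, because otherwise the positions returned by index-of() no longer line up with the positions in $vElems:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/*">
    <!-- Same node set as /*/tobject.subject, relative to the matched root element -->
    <xsl:variable name="vElems" select="tobject.subject"/>
    <xsl:copy>
      <xsl:copy-of select="@* | tobject.property"/>
      <!-- The predicate is numeric: an element is kept only when its own
           position equals the first position at which its refnum occurs -->
      <xsl:copy-of select="$vElems[index-of($vElems/@tobject.subject.refnum, @tobject.subject.refnum)[1]]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```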