How to use xsl:key to get elements with a unique structure

Time: 2017-02-09 07:12:50

Tags: xml xslt xslt-2.0

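Please suggest how to use xsl:key to avoid listing duplicate elements. (I already get the required result with the variable-based approach below, but it is not efficient.)

In my input, 'Ref' is the main element and it has several descendants. Only 'Ref' elements whose structure (element names only, not content) is unique should be listed. For example, given <Ref><a>1</a><b>3</b></Ref> and <Ref><a>1001</a><b>2001</b></Ref>, only the first <Ref> should be output. In the given input, the 'au' and 'ed' elements and everything inside them should be ignored when comparing structures.

Input XML: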

<article>
<Ref id="ref1">
    <RefText>
        <authors><au><snm>Kishan</snm><fnm>TR</fnm></au><au><snm>Rudramuni</snm><fnm>TP</fnm></au></authors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2016</Year><vol>1</vol>
        <fpage>12</fpage><lpage>14</lpage>
    </RefText></Ref><!-- should be listed -->

<Ref id="ref2">
    <RefText>
        <authors><au><snm>Rudramuni</snm><fnm>TP</fnm></au></authors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year><vol>2</vol>
        <fpage>22</fpage><lpage>24</lpage>
        </RefText></Ref><!-- should not be listed: same element structure as ref1 (authors, artTitle, jTitle, Year, vol, fpage, lpage) -->

<Ref id="ref3">
    <RefText>
        <authors><au><snm>Likhith</snm><fnm>MD</fnm></au></authors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year><fpage>22</fpage><lpage>24</lpage>
        </RefText></Ref><!-- should be listed: 'vol' is missing, so the structure differs from the previous Refs -->

<Ref id="ref4">
    <RefText>
        <authors><au><snm>Kowshik</snm><fnm>MD</fnm></au></authors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year><fpage>22</fpage>
        </RefText></Ref><!-- should be listed: 'lpage' is missing -->

<Ref id="ref5">
    <RefText>
        <editors><au><snm>Dhyan</snm><fnm>MD</fnm></au></editors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year><fpage>22</fpage>
        </RefText></Ref><!-- should be listed: has 'editors' instead of 'authors' -->

<Ref id="ref6">
    <RefText>
        <editors><ed><snm>Kishan</snm><fnm>TR</fnm></ed></editors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year>
        </RefText></Ref><!-- should be listed -->

<Ref id="ref7">
    <RefText>
        <editors><ed><snm>Vivan</snm><fnm>S</fnm></ed></editors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year>
        </RefText></Ref><!-- should not be listed: same element structure as ref6 -->

<Ref id="ref8">
    <RefText><editors><au><snm>Dhyan</snm><fnm>MD</fnm></au><au><snm>Dhyan</snm><fnm>MD</fnm></au></editors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year><fpage>22</fpage>
        </RefText></Ref><!-- should not be listed: ref5 and ref8 have the same elements -->

</article>

XSLT 2.0: here I use variables that store the names of each Ref's descendants.

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >

<xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>

<xsl:template match="article">
    <article>

        <xsl:for-each select="descendant::Ref">
            <xsl:variable name="varPrev">
            <xsl:for-each select="preceding::Ref">
                <a>
                    <xsl:text>|</xsl:text>
                        <xsl:for-each select="descendant::*[not(ancestor-or-self::au) and not(ancestor-or-self::ed)]">
                            <xsl:value-of select="name()"/>
                        </xsl:for-each>
                    <xsl:text>|</xsl:text>
                </a>
            </xsl:for-each>
        </xsl:variable>
            <xsl:variable name="varPresent">
                <a>
                    <xsl:text>|</xsl:text>
                        <xsl:for-each select="descendant::*[not(ancestor-or-self::au) and not(ancestor-or-self::ed)]">
                            <xsl:value-of select="name()"/>
                        </xsl:for-each>
                    <xsl:text>|</xsl:text>
                </a>
            </xsl:variable>
            <xsl:if test="not(contains($varPrev, $varPresent))">
                <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
            </xsl:if>

        </xsl:for-each>
    </article>
</xsl:template>

<!--xsl:key name="keyRef" match="Ref" use="descendant::*"/>

<xsl:template match="article">
    <xsl:for-each select="descendant::Ref">
        <xsl:if test="count('keyRef', ./name())=1">
            <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
        </xsl:if>
    </xsl:for-each>
</xsl:template-->

</xsl:stylesheet>

Required result:

<article>
<Ref id="ref1">
    <RefText>
        <authors><au><snm>Kishan</snm><fnm>TR</fnm></au><au><snm>Rudramuni</snm><fnm>TP</fnm></au></authors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2016</Year><vol>1</vol>
        <fpage>12</fpage><lpage>14</lpage>
    </RefText></Ref>
<Ref id="ref3">
    <RefText>
        <authors><au><snm>Likhith</snm><fnm>MD</fnm></au></authors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year><fpage>22</fpage><lpage>24</lpage>
        </RefText></Ref>
<Ref id="ref4">
    <RefText>
        <authors><au><snm>Kowshik</snm><fnm>MD</fnm></au></authors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year><fpage>22</fpage>
        </RefText></Ref>
<Ref id="ref5">
    <RefText><editors><au><snm>Dhyan</snm><fnm>MD</fnm></au></editors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year><fpage>22</fpage>
        </RefText></Ref>
<Ref id="ref6">
    <RefText>
        <editors><ed><snm>Kishan</snm><fnm>TR</fnm></ed></editors>
        <artTitle>The article1</artTitle><jTitle>Journal title</jTitle>
        <Year>2017</Year>
        </RefText></Ref>
</article>

2 answers:

Answer 0 (score: 1)

Try a computed key, similar to your string comparison:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf" exclude-result-prefixes="mf xs">

    <xsl:function name="mf:fingerprint" as="xs:string">
        <xsl:param name="input-element" as="element()"/>
        <xsl:value-of select="for $d in $input-element/descendant::*[not(ancestor-or-self::au) and not(ancestor-or-self::ed)] return node-name($d)" separator="|"/>
    </xsl:function>

    <xsl:key name="group" match="Ref" use="mf:fingerprint(.)"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="Ref[not(. is key('group', mf:fingerprint(.))[1])]"/>
</xsl:transform>

As far as I can tell, it does the job at http://xsltransform.net/bwdwsC, but I am not sure whether string concatenation of the names is sufficient for all kinds of input.
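
One input where joining names alone could collide: a Ref containing <a><b/></a> and a Ref containing <a/><b/> both produce a|b, although the nesting differs. If nesting should count as part of the structure (an assumption, the question does not say so), a stricter variant could prefix each name with its depth relative to the Ref. The following is only a sketch of that idea, adapted from the stylesheet above and not part of the original answer:

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf" exclude-result-prefixes="mf xs">

    <!-- Sketch: fingerprint made of "depth:name" entries in document order.
         Elements inside au/ed (and au/ed themselves) are skipped, as in the question. -->
    <xsl:function name="mf:fingerprint" as="xs:string">
        <xsl:param name="input-element" as="element()"/>
        <!-- depth of the Ref itself, so that entries are relative to it -->
        <xsl:variable name="base-depth" select="count($input-element/ancestor::*)"/>
        <xsl:sequence select="string-join(
            for $d in $input-element/descendant::*[not(ancestor-or-self::au) and not(ancestor-or-self::ed)]
            return concat(count($d/ancestor::*) - $base-depth, ':', name($d)),
            '|')"/>
    </xsl:function>

    <xsl:key name="group" match="Ref" use="mf:fingerprint(.)"/>

    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- drop every Ref that is not the first one with its fingerprint -->
    <xsl:template match="Ref[not(. is key('group', mf:fingerprint(.))[1])]"/>
</xsl:transform>
```

In the sample input every Ref uses the same nesting, so this variant should select the same Ref elements as the name-only fingerprint. The difference only matters for inputs where the same names appear at different depths.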

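Answer 1 (score: 1)

I would suggest the following approach:

  • Remove the descendants of authors and editors, as well as all text nodes;

  • Compare the remaining nodes using deep-equal().

Here is a simplified proof of concept: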

XSLT 2.0

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/article">
    <xsl:variable name="first-pass">
        <xsl:apply-templates mode="first-pass"/>
    </xsl:variable>
    <xsl:copy>
        <xsl:for-each select="$first-pass/Ref[not(some $ref in preceding-sibling::Ref satisfies deep-equal(RefText, $ref/RefText))]">
            <Ref id="{@id}"/>
        </xsl:for-each>
    </xsl:copy>
</xsl:template>

<!-- identity transform -->
<xsl:template match="@*|node()" mode="#all">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()" mode="#current"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="authors | editors" mode="first-pass">
    <xsl:copy/>
</xsl:template>

<xsl:template match="text()" mode="first-pass" priority="0"/>

</xsl:stylesheet>

Result

<?xml version="1.0" encoding="UTF-8"?>
<article>
   <Ref id="ref1"/>
   <Ref id="ref3"/>
   <Ref id="ref4"/>
   <Ref id="ref5"/>
   <Ref id="ref6"/>
</article>
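
The proof of concept outputs empty Ref placeholders that carry only the id, whereas the question asks for the complete Ref elements. One possible adaptation, assuming every Ref has a unique id attribute (true for the sample, but an assumption in general), is to keep a reference to the source article and copy the matching original Ref. The variable name original is introduced here purely for illustration:

```xml
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/article">
    <!-- keep a handle on the source tree; inside the for-each the context is the temporary tree -->
    <xsl:variable name="original" select="."/>
    <xsl:variable name="first-pass">
        <xsl:apply-templates mode="first-pass"/>
    </xsl:variable>
    <xsl:copy>
        <xsl:for-each select="$first-pass/Ref[not(some $ref in preceding-sibling::Ref satisfies deep-equal(RefText, $ref/RefText))]">
            <!-- look up the untouched source Ref by id (assumes ids are unique) -->
            <xsl:copy-of select="$original/Ref[@id = current()/@id]"/>
        </xsl:for-each>
    </xsl:copy>
</xsl:template>

<!-- identity transform -->
<xsl:template match="@*|node()" mode="#all">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()" mode="#current"/>
    </xsl:copy>
</xsl:template>

<!-- first pass: keep authors/editors as empty shells -->
<xsl:template match="authors | editors" mode="first-pass">
    <xsl:copy/>
</xsl:template>

<!-- first pass: drop all text so only element structure is compared -->
<xsl:template match="text()" mode="first-pass" priority="0"/>

</xsl:stylesheet>
```

The comparison still runs on the stripped first-pass tree. Only the output step changes, so the same Ref ids are selected as in the result above.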