Question

I have the following XML.

<body>
  <p type="Heading 1">My Heading</p>
  <p>This is paragraph Text... This is paragraph text... <p type="Key Words">This is a keyword A</p></p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text... <p type="Key Words">This is a keyword B</p></p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p type="Heading 1">My Next Heading</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... <p type="Key Words">This is a keyword C</p>This is paragraph text...</p>
  <p type="Heading 2">My Next Heading</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... <p type="Key Words">This is a keyword D</p> This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
</body>

I want to move all of the "Key Words" paragraphs so that they sit immediately before the next heading, like this:

<body>
  <p type="Heading 1">My Heading</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p type="Key Words">This is a keyword A</p>
  <p type="Key Words">This is a keyword B</p>
  <p type="Heading 1">My Next Heading</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p type="Key Words">This is a keyword C</p>
  <p type="Heading 2">My Next Heading</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p>This is paragraph Text...  This is paragraph text...</p>
  <p>This is paragraph Text... This is paragraph text...</p>
  <p type="Key Words">This is a keyword D</p>
</body>

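I have code that works, but it has serious performance problems, because I run this transformation on documents that are tens of thousands of words long. Here is my current code.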

<!-- Place all keywords in section right before the next heading title. -->
<xsl:template match="p[contains(@type,'Heading')]">

  <xsl:variable name="headingCount" >
    <xsl:value-of select="count(preceding::p[contains(@type,'Heading')])"/>
  </xsl:variable>

  <xsl:variable name="precedingKeyWordCount">
    <xsl:value-of select="count(preceding::p[contains(@type,'Key Words') and count(preceding::p[contains(@type,'Heading')]) = $headingCount])"/>
  </xsl:variable>


  <xsl:if test="$precedingKeyWordCount > 0" >
    <p type="Key Words">
      <xsl:apply-templates select="preceding::p[contains(@type,'Key Words') and count(preceding::p[contains(@type,'Heading')]) = $headingCount]" />
    </p>
  </xsl:if>


  <!-- place original heading -->
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>


</xsl:template>

Does anyone know a more efficient way to achieve this?

Thanks.

Answer 1

Here is an example that uses a key (xsl:key):

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>

  <xsl:key name="k1" 
    match="p/p[@type = 'Key Words']"
    use="generate-id(parent::p/following-sibling::p[starts-with(@type, 'Heading')][1])"/>

  <xsl:template match="@* | node()" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="body">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
      <xsl:apply-templates select="key('k1', '')"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="body/p[starts-with(@type, 'Heading')]">
    <xsl:apply-templates select="key('k1', generate-id())"/>
    <xsl:call-template name="identity"/>
  </xsl:template>

  <xsl:template match="body/p[not(@type) or not(starts-with(@type, 'Heading'))]">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()[not(self::p[@type = 'Key Words'])]"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

That should perform better.

Answer 2

A slightly simpler example (assuming the Key Words paragraphs can be copied to the output in full):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output indent="yes"/>
    <xsl:key name="byHeading" match="p[@type='Key Words']" 
             use="generate-id(following::p[starts-with(@type, 'Heading')][1])"/>
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="body">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
            <xsl:copy-of select="key('byHeading', '')"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="p[starts-with(@type, 'Heading')]">
        <xsl:copy-of select="key('byHeading', generate-id())"/>
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="p[@type='Key Words']"/>
</xsl:stylesheet>

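The main problem with the original stylesheet is that, for every heading, it has to look at every preceding element in the document several times: each preceding:: expression carries a nested preceding:: count in its predicate, and the whole expression is evaluated twice per heading. The preceding and following axes often cause performance problems on large documents.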

We can avoid this performance problem by using a key to group the Key Words paragraphs by heading in advance.
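To see what the key actually groups, the following is a minimal diagnostic sketch, not part of either answer. It reuses the byHeading key definition from Answer 2 and, instead of transforming the document, prints how many Key Words paragraphs are indexed under each heading. It assumes the input has body as its root element, as in the sample above; the report wording is made up for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- Same key as in Answer 2: each Key Words paragraph is indexed under
         the generated id of the first heading that follows it. A paragraph
         with no following heading is indexed under the empty string. -->
    <xsl:key name="byHeading" match="p[@type='Key Words']"
             use="generate-id(following::p[starts-with(@type, 'Heading')][1])"/>

    <!-- Assumption: body is the root element of the input document. -->
    <xsl:template match="/">
        <xsl:for-each select="body/p[starts-with(@type, 'Heading')]">
            <xsl:value-of select="."/>
            <xsl:text>: </xsl:text>
            <!-- Look up the group for the current heading by its id. -->
            <xsl:value-of select="count(key('byHeading', generate-id()))"/>
            <xsl:text> keyword(s) placed before it&#10;</xsl:text>
        </xsl:for-each>
        <xsl:text>end of body: </xsl:text>
        <!-- Keywords after the last heading have the empty string as key. -->
        <xsl:value-of select="count(key('byHeading', ''))"/>
        <xsl:text> keyword(s)&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

Run against the sample input from the question, this should report 0 keywords for the first heading, 2 for the second (A and B), 1 for the third (C), and 1 at the end of the body (D). That matches where the paragraphs land in the desired output, and it is why both answers also call key(..., '') inside the body template. Whether the key is built in a single pass is up to the XSLT processor, but most implementations build the index on first use, so each later lookup is cheap.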

How to move elements with XSLT
