I have the following XML.
<body>
<p type="Heading 1">My Heading</p>
<p>This is paragraph Text... This is paragraph text... <p type="Key Words">This is a keyword A</p></p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text... <p type="Key Words">This is a keyword B</p></p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p type="Heading 1">My Next Heading</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... <p type="Key Words">This is a keyword C</p>This is paragraph text...</p>
<p type="Heading 2">My Next Heading</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... <p type="Key Words">This is a keyword D</p> This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
</body>
I want to move all of the "Key Words" paragraphs so that they sit immediately before the next heading, like this:
<body>
<p type="Heading 1">My Heading</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p type="Key Words">This is a keyword A</p>
<p type="Key Words">This is a keyword B</p>
<p type="Heading 1">My Next Heading</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p type="Key Words">This is a keyword C</p>
<p type="Heading 2">My Next Heading</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p>This is paragraph Text... This is paragraph text...</p>
<p type="Key Words">This is a keyword D</p>
</body>
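I have code that works, but it has serious performance problems because I am running this transformation on documents with tens of thousands of words. Here is my current code.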
<!-- Place all keywords in section right before the next heading title. -->
<xsl:template match="p[contains(@type,'Heading')]">
  <xsl:variable name="headingCount" >
    <xsl:value-of select="count(preceding::p[contains(@type,'Heading')])"/>
  </xsl:variable>
  <xsl:variable name="precedingKeyWordCount">
    <xsl:value-of select="count(preceding::p[contains(@type,'Key Words') and count(preceding::p[contains(@type,'Heading')]) = $headingCount])"/>
  </xsl:variable>
  <xsl:if test="$precedingKeyWordCount > 0" >
    <p type="Key Words">
      <xsl:apply-templates select="preceding::p[contains(@type,'Key Words') and count(preceding::p[contains(@type,'Heading')]) = $headingCount]" />
    </p>
  </xsl:if>
  <!-- place original heading -->
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
Does anyone know of a more efficient way to achieve this?
Thanks.
Answer 0 (score: 1)
Here is an example that uses a key:
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>
  <xsl:key name="k1"
    match="p/p[@type = 'Key Words']"
    use="generate-id(parent::p/following-sibling::p[starts-with(@type, 'Heading')][1])"/>
  <xsl:template match="@* | node()" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="body">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
      <xsl:apply-templates select="key('k1', '')"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="body/p[starts-with(@type, 'Heading')]">
    <xsl:apply-templates select="key('k1', generate-id())"/>
    <xsl:call-template name="identity"/>
  </xsl:template>
  <xsl:template match="body/p[not(@type) or not(starts-with(@type, 'Heading'))]">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()[not(self::p[@type = 'Key Words'])]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
That should perform better.
Answer 1 (score: 1)
A slightly simpler example (assuming the Key Words paragraphs can be copied to the output as they are):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>
  <xsl:key name="byHeading" match="p[@type='Key Words']"
    use="generate-id(following::p[starts-with(@type, 'Heading')][1])"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="body">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:copy-of select="key('byHeading', '')"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="p[starts-with(@type, 'Heading')]">
    <xsl:copy-of select="key('byHeading', generate-id())"/>
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="p[@type='Key Words']"/>
</xsl:stylesheet>
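The main problem with the original stylesheet is that, for every heading, it has to look at every preceding element in the document several times: it walks the preceding axis to find candidate Key Words paragraphs, and then for each candidate it walks the preceding axis again to count headings. The preceding and following axes often cause performance problems on large documents.
We can avoid this performance problem by using a key to group the Key Words paragraphs in advance, so that each heading only has to look up its own group.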
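To see what the key actually groups, a small diagnostic stylesheet can help. The following is only a sketch: it reuses the `byHeading` key definition from the stylesheet above, and the text-output report format and the "(no following heading)" label are made up for illustration. It relies on `generate-id()` returning an empty string for an empty node-set, which is why Key Words paragraphs with no later heading end up under the key value `''`.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Same key as above: each Key Words paragraph is indexed under the
       generated id of the first heading that follows it. -->
  <xsl:key name="byHeading" match="p[@type='Key Words']"
    use="generate-id(following::p[starts-with(@type, 'Heading')][1])"/>

  <xsl:template match="/">
    <!-- One report line per heading: heading text and size of its group. -->
    <xsl:for-each select="//p[starts-with(@type, 'Heading')]">
      <xsl:value-of select="concat(., ': ', count(key('byHeading', generate-id())), '&#10;')"/>
    </xsl:for-each>
    <!-- Key Words with no following heading are indexed under ''. -->
    <xsl:value-of select="concat('(no following heading): ', count(key('byHeading', '')), '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample input from the question, this should report 0 for "My Heading", 2 and 1 for the two "My Next Heading" paragraphs, and 1 for the group with no following heading. Note that the `use` expression still walks the following axis, but a processor typically evaluates it only once per Key Words paragraph when it builds the index, instead of rescanning the document for every heading.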