Splitting an XML file into multiple files using h1 and its id

Asked: 2011-02-17 15:18:40

Tags: xml xslt

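I'm an XSLT newbie. I'm transforming an XML file into HTML. The resulting files will be used as server-side includes, in the form of .inc files. Now I need to split the XML file at the h1 nodes and write it out to multiple .inc files (each containing everything between one h1 node and the next), using the h1 id as the file name. The h1 id is held in the 'scriptlabel' attribute. Right now the split itself works, but each file contains only the h1 and ignores everything after it. What am I doing wrong?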

Here is a sample of the XML:

`<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document SYSTEM "RRfront150610.dtd">
<document>
  <section charstyle="No Style" pagenum="56" parastyle="Gov-Head-A"
  scriptlabel="Gov-chairman-intro">
    <h1 charstyle="No Style" pagenum="56" parastyle="Gov-Head-A"
    scriptlabel="Gov-chairman-intro">chairman&#8217;s
    introduction</h1>
    <p charstyle="No Style" pagenum="56"
    parastyle="Gov&#8211;Head-B-CI" scriptlabel="">
      <strong charstyle="No Style" pagenum="56"
      parastyle="Gov&#8211;Head-B-CI" scriptlabel="">Lorem ipsum
      dolor sit amet, consectetur adipiscing elit. Morbi et leo
      purus. Maecenas at metus massa. Donec rutrum tortor ac enim
      tincidunt ut posuere purus aliquam.</strong>
    </p>
    <p charstyle="No Style" pagenum="56" parastyle="Gov-Body-CI"
    scriptlabel="">Lorem ipsum dolor sit amet, consectetur
    adipiscing elit. Morbi et leo purus. Maecenas at metus massa.
    Donec rutrum tortor ac enim tincidunt ut posuere purus
    aliquam.</p>
  </section>
</document>`

Here is the XSLT that does the splitting:

`<xsl:template match="/">
  <xsl:apply-templates />
</xsl:template>
<xsl:template match="document">
  <xsl:apply-templates />
</xsl:template>
<xsl:template match="h1">
  <xsl:variable name="filename"
  select="concat(@scriptlabel,'.inc')" />
  <xsl:value-of select="$filename" />
  <xsl:result-document href="{$filename}">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:result-document>
</xsl:template>`

2 answers:

Answer 0: (score: 3)

Short answer. This stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="section">
        <xsl:for-each-group select="node()" group-starting-with="h1">
            <xsl:result-document href="{@scriptlabel}.inc">
                <xsl:copy-of select="current-group()"/>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

produces the following Gov-chairman-intro.inc:

<h1 charstyle="No Style" 
    pagenum="56" 
    parastyle="Gov-Head-A" 
    scriptlabel="Gov-chairman-intro"
 >chairman’s     introduction</h1>
<p charstyle="No Style" 
   pagenum="56" 
   parastyle="Gov–Head-B-CI" 
   scriptlabel="">
    <strong charstyle="No Style" 
                 pagenum="56" 
                 parastyle="Gov–Head-B-CI" 
                 scriptlabel=""
          >Lorem ipsum       dolor sit amet, consectetur adipiscing elit. Morbi et leo       purus. Maecenas at metus massa. Donec rutrum tortor ac enim       tincidunt ut posuere purus aliquam.</strong>
</p>
<p charstyle="No Style" 
   pagenum="56" 
   parastyle="Gov-Body-CI" 
   scriptlabel=""
 >Lorem ipsum dolor sit amet, consectetur     adipiscing elit. Morbi et leo purus. Maecenas at metus massa.     Donec rutrum tortor ac enim tincidunt ut posuere purus     aliquam.</p>

Note: this groups the children of section, starting a new group at each h1, and copies the whole current group.
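One caveat is worth sketching, under the assumption that the parser does not strip whitespace-only text nodes: with select="node()", the whitespace before the first h1 is the first item in the population, so it starts a group of its own, and that group's first node has no scriptlabel attribute. The variation below is not the answerer's code. It selects only element children and skips any leading group that does not start with an h1, which also means that any content before the first h1 is dropped.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="section">
        <!-- select="*" ignores text nodes, so inter-element whitespace cannot start a group -->
        <xsl:for-each-group select="*" group-starting-with="h1">
            <!-- the context item is the first item of the group;
                 skip a leading group that does not begin with an h1 -->
            <xsl:if test="self::h1">
                <xsl:result-document href="{@scriptlabel}.inc">
                    <xsl:copy-of select="current-group()"/>
                </xsl:result-document>
            </xsl:if>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>
```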

Update: handling a section with no h1 children, where no group starts with an h1.

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="section">
        <xsl:for-each-group select="*" group-adjacent="boolean(self::h1)">
            <xsl:if test="not(current-grouping-key())">
                <xsl:variable name="vMark" select="preceding-sibling::h1[1]"/>
                <xsl:result-document
                     href="{((..|$vMark)/@scriptlabel)[last()]}.inc">
                    <xsl:copy-of select="current-group()|$vMark"/>
                </xsl:result-document>
            </xsl:if>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

With this input:

<document>
    <section charstyle="No Style" pagenum="56" parastyle="Gov-Head-A"
             scriptlabel="Gov-chairman-intro">
        <h1 charstyle="No Style" pagenum="56" parastyle="Gov-Head-A"
             scriptlabel="Gov-chairman-intro">chairman&#8217;s
             introduction</h1>
        <p charstyle="No Style" pagenum="56"
           parastyle="Gov&#8211;Head-B-CI" scriptlabel="">
            <strong charstyle="No Style" pagenum="56"
                    parastyle="Gov&#8211;Head-B-CI" scriptlabel=""
             >Lorem ipsum dolor sit amet, consectetur adipiscing elit.
              Morbi et leo purus. Maecenas at metus massa. Donec
              rutrum tortor ac enim tincidunt ut posuere purus
              aliquam.</strong>
        </p>
        <p charstyle="No Style" pagenum="56" parastyle="Gov-Body-CI"
           scriptlabel="">Lorem ipsum dolor sit amet, consectetur
           adipiscing elit. Morbi et leo purus. Maecenas at metus
           massa. Donec rutrum tortor ac enim tincidunt ut posuere
           purus aliquam.</p>
    </section>
    <section charstyle="No Style" pagenum="56" parastyle="Gov-Head-A"
             scriptlabel="Test-no-H1">
        <p charstyle="No Style" pagenum="56"
           parastyle="Gov&#8211;Head-B-CI" scriptlabel="">
            <strong charstyle="No Style" pagenum="56"
                    parastyle="Gov&#8211;Head-B-CI" scriptlabel=""
             >Lorem ipsum dolor sit amet, consectetur adipiscing elit.
              Morbi et leo purus. Maecenas at metus massa. Donec
              rutrum tortor ac enim tincidunt ut posuere purus
              aliquam.</strong>
        </p>
        <p charstyle="No Style" pagenum="56" parastyle="Gov-Body-CI"
           scriptlabel="">Lorem ipsum dolor sit amet, consectetur
           adipiscing elit. Morbi et leo purus. Maecenas at metus
           massa. Donec rutrum tortor ac enim tincidunt ut posuere
           purus aliquam.</p>
    </section>
</document>

it correctly serializes Gov-chairman-intro.inc:

<h1 charstyle="No Style" pagenum="56" parastyle="Gov-Head-A" scriptlabel="Gov-chairman-intro">chairman’s
             introduction</h1><p charstyle="No Style" pagenum="56" parastyle="Gov–Head-B-CI" scriptlabel=""><strong charstyle="No Style" pagenum="56" parastyle="Gov–Head-B-CI" scriptlabel="">Lorem ipsum dolor sit amet, consectetur adipiscing elit.
              Morbi et leo purus. Maecenas at metus massa. Donec
              rutrum tortor ac enim tincidunt ut posuere purus
              aliquam.</strong></p><p charstyle="No Style" pagenum="56" parastyle="Gov-Body-CI" scriptlabel="">Lorem ipsum dolor sit amet, consectetur
           adipiscing elit. Morbi et leo purus. Maecenas at metus
           massa. Donec rutrum tortor ac enim tincidunt ut posuere
           purus aliquam.</p>

Test-no-H1.inc

<p charstyle="No Style" pagenum="56" parastyle="Gov–Head-B-CI" scriptlabel=""><strong charstyle="No Style" pagenum="56" parastyle="Gov–Head-B-CI" scriptlabel="">Lorem ipsum dolor sit amet, consectetur adipiscing elit.
              Morbi et leo purus. Maecenas at metus massa. Donec
              rutrum tortor ac enim tincidunt ut posuere purus
              aliquam.</strong></p><p charstyle="No Style" pagenum="56" parastyle="Gov-Body-CI" scriptlabel="">Lorem ipsum dolor sit amet, consectetur
           adipiscing elit. Morbi et leo purus. Maecenas at metus
           massa. Donec rutrum tortor ac enim tincidunt ut posuere
           purus aliquam.</p>

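Note: adjacent nodes are grouped by the test "am I a marker?"; for each non-marker group, the group is copied together with its preceding marker.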
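A similar fallback can probably be written with group-starting-with as well. The sketch below is an alternative under that assumption, not the answerer's code: the file name comes from the h1 that opens the group when there is one, and from the parent section otherwise. One difference in behavior: an h1 with nothing after it still gets its own file here, whereas the group-adjacent version above writes nothing for it.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="section">
        <xsl:for-each-group select="*" group-starting-with="h1">
            <!-- first item of the sequence wins: the h1's scriptlabel if the group
                 starts with an h1, otherwise the scriptlabel of the parent section -->
            <xsl:result-document
                 href="{(self::h1/@scriptlabel, ../@scriptlabel)[1]}.inc">
                <xsl:copy-of select="current-group()"/>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>
```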

Answer 1: (score: 0)

Your template matches on "h1", so it only puts the h1 into the result document.

You could rearrange your data so that you have...

<section>
  <h1>Content 1</h1>
  <p>...</p>
  <p>...</p>
</section>
<section>
  <h1>Content 2</h1>
  <p>...</p>
  <p>...</p>
</section>

You can rename the section tag to whatever you like, so as not to break existing code. Your XSLT would then look like this:

<xsl:template match="section">
  <xsl:variable name="filename"
  select="concat(@scriptlabel,'.inc')" />
  <xsl:value-of select="$filename" />
  <xsl:result-document href="{$filename}">
      <xsl:copy-of select=" ./* " />
  </xsl:result-document>
</xsl:template>
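If rearranging the data is not an option, the match on h1 can be kept and the following siblings pulled in explicitly. This is a minimal sketch, assuming an XSLT 2.0 processor (it relies on the `is` node comparison) and assuming that every element belonging to an h1 is a later sibling of it, as in the sample XML:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- visit only the h1 children, so other content is not echoed to the main output -->
    <xsl:template match="section">
        <xsl:apply-templates select="h1"/>
    </xsl:template>

    <xsl:template match="h1">
        <xsl:result-document href="{@scriptlabel}.inc">
            <!-- the h1 itself -->
            <xsl:copy-of select="."/>
            <!-- non-h1 siblings after it whose nearest preceding h1 is this one -->
            <xsl:copy-of select="following-sibling::*[not(self::h1)]
                                 [preceding-sibling::h1[1] is current()]"/>
        </xsl:result-document>
    </xsl:template>
</xsl:stylesheet>
```

For documents with many siblings per section, the grouping approach is likely to scale better, since this predicate re-evaluates preceding-sibling for every candidate node.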