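Add the value inside a tag as an id using XSLT

Time: 2017-06-14 11:08:10

Tags: xml xslt

I want to take the section number that appears inside the title tag and add it as the id of the section. My input XML file: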

<?xml version="1.0" encoding="UTF-8"?>


<chapter id="d102e3" xml:lang="en-US">

<title outputclass="Chapter_Title">Base Food</title>
<subsection id="d102e11" xml:lang="en-US" outputclass="Heading_1">
<title> § 38.1 Nothing</title>
<body>
<p outputclass="Body_Text">1Y The Act also states that the may undertake a review of the definition of the term.</p>
</body>

<subsection id="d102e20" xml:lang="en-US" outputclass="Heading_2">
<title> § 38.1.1 Proposed Amendments: “Accredited Natural Person”</title>

<body>
<p outputclass="Body_Text">1Y The Act also states that the may undertake a review of the definition of the term.</p>
</body>

</subsection>
</subsection>

</chapter>

My XSLT code is:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">


  <xsl:output method="xml" indent="yes" omit-xml-declaration="no"></xsl:output>



<xsl:template match="/">

  <xsl:apply-templates></xsl:apply-templates>
</xsl:template>


  <xsl:template match="*[contains(@class,' chapter/chapter ')]">
    <chapter>
      <xsl:apply-templates/>
    </chapter>
  </xsl:template>

  <xsl:template match="subsection">
    <section level="sect{format-number(count(preceding::subsection)+1,'0000')}" id="sect_chap38_38.1" num="38.1">
      <xsl:apply-templates/>
    </section>
  </xsl:template>

  <xsl:template match="*[contains(@class,' topic/p ')]">
    <para>
       <xsl:apply-templates/>
    </para>
  </xsl:template>

  <xsl:template match="*[contains(@class,' topic/title ')]">
    <title>
      <xsl:apply-templates/>
    </title>
   </xsl:template>
</xsl:stylesheet>

My output is:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<chapter>

<title>Base Food</title>
<section level="sect0001" id="sect_chap38_38.1" num="38.1">
<title> § 38.1 Nothing</title>
<para>1Y The Act also states that the may undertake a review of the definition of the term.</para>
<section level="sect0001" id="sect_chap38_38.1" num="38.1">
<title> § 38.1.1 Proposed Amendments: “Accredited Natural Person”</title>

<para>1Y The Act also states that the may undertake a review of the definition of the term.</para>
</section>
</section>

</chapter>

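But I want the number from the title tag to be written to the 'id' and 'num' attributes, and the symbol and number (for example '§ 38.1') removed from the title tag, with the sections kept in sequence, as shown below:

Required output: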

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<chapter>

<title>Base Food</title>
<section level="sect1" id="sect_chap38_38.1" num="38.1">
<title>Nothing</title>
<para>1Y The Act also states that the may undertake a review of the definition of the term.</para>
<section level="sect2" id="sect_chap38_38.1.1" num="38.1.1">
<title>Proposed Amendments: “Accredited Natural Person”</title>


<para>1Y The Act also states that the may undertake a review of the definition of the term.</para>
</section>
</section>
<section level="sect1" id="sect_chap38_38.2" num="38.2">
    <title>Nothing1</title>
    <para>1Y The Act also states that the may undertake a review of the definition of the term.</para>
    <section level="sect2" id="sect_chap38_38.2.1" num="38.2.1">
    <title>Proposed Amendments: “Accredited Natural Person”1</title>

    <para>1Y The Act also states that the may undertake a review of the definition of the term.</para>
    </section>
    </section>

</chapter>

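Please help me.

Thanks in advance.

1 answer:

Answer 0 (score: 1)

If I understand your question correctly, I think you are looking for something like this: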

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output method="xml" indent="yes"/>

  <xsl:strip-space elements="*"/>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="subsection">
    <section>
      <!-- Nesting depth gives sect1, sect2, ... as in the required output -->
      <xsl:attribute name="level" select="concat('sect', count(ancestor-or-self::subsection))"/>

      <xsl:apply-templates select="title" mode="id-num"/>
      <xsl:apply-templates/>
    </section>
  </xsl:template>

  <xsl:template match="title" mode="id-num">
    <xsl:variable name="num">
      <xsl:analyze-string select="." regex="\s*§\s(\d.+?)\s">
        <xsl:matching-substring>
          <xsl:value-of select="regex-group(1)"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>

    <xsl:attribute name="id" select="
      concat('sect_chap', tokenize($num, '\.')[1], '_', $num)
    "/>

    <xsl:attribute name="num" select="$num"/>
  </xsl:template>

  <xsl:template match="title/text()">
    <xsl:value-of select="replace(., '\s*§\s\d.+?\s(.*)', '$1')"/>
  </xsl:template>

  <xsl:template match="p">
    <para>
      <xsl:apply-templates/>
    </para>
  </xsl:template>

</xsl:stylesheet>
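
One thing to watch: the identity template in the stylesheet above copies every node that has no rule of its own, so the `body` wrapper and the attributes on `chapter` and `title` (`id`, `xml:lang`, `outputclass`) would still show up in the result. A minimal sketch of two extra templates that bring the result closer to the required output, assuming those wrappers and attributes really are unwanted (the sketch is wrapped in its own small stylesheet so it runs on its own; in practice the two templates would simply be added to the stylesheet above):

```xslt
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template, same as in the stylesheet above. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unwrap body: process its children but emit no wrapper element. -->
  <xsl:template match="body">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Drop the source attributes on chapter and title
       (id, xml:lang, outputclass in the sample input). -->
  <xsl:template match="chapter/@* | title/@*"/>

</xsl:stylesheet>
```

Both templates have a higher default priority than the identity template, so they take precedence without any explicit `priority` attribute.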

You may need to adjust the regexp a little, depending on your actual content.
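
As one possible adjustment, here is a sketch with a stricter pattern. It assumes every numbered title starts with `§` followed by dot-separated digit groups (as in `§ 38.1` and `§ 38.1.1`); the variable name `prefix` is made up for this example. Unlike the version above, it emits no `id` or `num` attribute at all when a title has no such prefix, instead of emitting empty values. Only the variable and the two `title` templates differ from the stylesheet above; the `level` attribute is left out for brevity.

```xslt
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumed prefix shape: optional spaces, §, optional spaces,
       dot-separated digit groups (38.1, 38.1.1), optional spaces. -->
  <xsl:variable name="prefix" select="'^\s*§\s*(\d+(\.\d+)*)\s*'"/>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="subsection">
    <section>
      <xsl:apply-templates select="title" mode="id-num"/>
      <xsl:apply-templates/>
    </section>
  </xsl:template>

  <!-- Emit id and num only when the title really starts with a § number. -->
  <xsl:template match="title" mode="id-num">
    <xsl:if test="matches(., $prefix)">
      <!-- Keep just the captured number; the 's' flag lets . span line breaks. -->
      <xsl:variable name="num" select="replace(., concat($prefix, '.*$'), '$1', 's')"/>
      <xsl:attribute name="id" select="concat('sect_chap', tokenize($num, '\.')[1], '_', $num)"/>
      <xsl:attribute name="num" select="$num"/>
    </xsl:if>
  </xsl:template>

  <!-- Strip the prefix from the title text; text without it is left as is. -->
  <xsl:template match="title/text()">
    <xsl:value-of select="replace(., $prefix, '')"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the pattern is anchored with `^` and requires the `§` sign, titles such as "Base Food" pass through unchanged.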