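Splitting a string with XSLT

Time: 2013-05-24 22:19:58

Tags: xml xslt

I'm using XSLT to convert an XML file into a delimited format that Excel can open (sample code is shown further down). For example, when opened in Excel, the delimited version might look like this: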

+---------------+---------------+----------+
|URL            |Title          | Version  |
+---------------+---------------+----------+
|dogs_are_cool  |Dogs are cool  | May 2013 |
+---------------+---------------+----------+

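The problem is that every URL has a version appended to the end. Using the previous example, dogs_are_cool is actually dogs_are_cool_may2013.html.

I'd like to do two things with the appended version:

  • Remove the version when printing the URL.
  • Reformat and print the version.

My guess is that the best approach is to somehow split the URL on the underscores, pull the last element into a variable, and then print the other elements in order with the underscores put back in.

I have no idea how to do that.

Sample XML: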

<contents Url="toc_animals_may2013.html" Title="Animals">
    <contents Url="toc_apes_may2013.html" Title="Apes">
        <contents Url="chimps_may2013.html" Title="Some Stuff About Chimps" />
    </contents>
    <contents Url="toc_cats" Title="Cats">
        <contents Url="hairless_cats_may2013.html" Title="OMG Where Did the Hair Go?"/>
        <contents Url="wild_cats_may2013.html" Title="These Things Frighten Me"/>
    </contents>
    <contents Url="toc_dogs_may2013.html" Title="Dogs">
        <contents Url="toc_snorty_dogs_may2013.html" Title="Snorty Dogs">
            <contents Url="boston_terriers_may2013.html" Title="Boston Terriers" />
            <contents Url="french_bull_dogs_may2013.html" Title="Frenchies" />
        </contents>
    </contents>
</contents>

Sample XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" indent="no"/>

    <!-- This variable sets the delimiter symbol that Excel will use to separate the cells -->
    <xsl:variable name="delimiter">@</xsl:variable>

    <xsl:template match="contents">

        <!-- Prints the URL -->
        <xsl:value-of select="@Url"/>
        <xsl:copy-of select="$delimiter" />

        <!-- Prints the title -->
        <xsl:apply-templates select="@Title"/>
        <xsl:copy-of select="$delimiter" />

        <!-- I'd like to print the version here -->
        <xsl:copy-of select="$delimiter" />
    </xsl:template>

    <xsl:template match="/">
        <xsl:apply-templates select="//contents"/>
    </xsl:template>

</xsl:stylesheet>

2 answers:

Answer 0 (score: 2)

If you can use XSLT 2.0, this becomes much simpler.

XML Input

<contents Url="toc_animals_may2013.html" Title="Animals">
    <contents Url="toc_apes_may2013.html" Title="Apes">
        <contents Url="chimps_may2013.html" Title="Some Stuff About Chimps" />
    </contents>
    <contents Url="toc_cats" Title="Cats">
        <contents Url="hairless_cats_may2013.html" Title="OMG Where Did the Hair Go?"/>
        <contents Url="wild_cats_may2013.html" Title="These Things Frighten Me"/>
    </contents>
    <contents Url="toc_dogs_may2013.html" Title="Dogs">
        <contents Url="toc_snorty_dogs_may2013.html" Title="Snorty Dogs">
            <contents Url="boston_terriers_may2013.html" Title="Boston Terriers" />
            <contents Url="french_bull_dogs_may2013.html" Title="Frenchies" />
        </contents>
    </contents>
</contents>

XSLT 2.0

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <xsl:param name="delim" select="'@'"/>

    <xsl:template match="contents">
        <xsl:variable name="urlTokens" select="tokenize(@Url,'_')"/>
        <xsl:value-of select="$urlTokens[not(position() = last())]" separator="_"/>
        <xsl:value-of select="$delim"/>
        <xsl:value-of select="concat(@Title,$delim)"/>
        <xsl:analyze-string select="$urlTokens[last()]" regex="([a-z])([a-z]+)([0-9]+)">
            <xsl:matching-substring>
                <xsl:value-of select="concat(upper-case(regex-group(1)),regex-group(2),' ',regex-group(3))"/>               
            </xsl:matching-substring>
        </xsl:analyze-string>
        <xsl:text>&#xA;</xsl:text>
        <xsl:apply-templates/>
    </xsl:template>

</xsl:stylesheet>

Output

toc_animals@Animals@May 2013
toc_apes@Apes@May 2013
chimps@Some Stuff About Chimps@May 2013
toc@Cats@
hairless_cats@OMG Where Did the Hair Go?@May 2013
wild_cats@These Things Frighten Me@May 2013
toc_dogs@Dogs@May 2013
toc_snorty_dogs@Snorty Dogs@May 2013
boston_terriers@Boston Terriers@May 2013
french_bull_dogs@Frenchies@May 2013
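
One thing the output above shows is that toc_cats, which carries no version, still loses its last token and prints as toc. Here is a minimal sketch of one way to guard against that. It assumes a version token is always lowercase letters followed by digits, with an optional .html extension. That pattern and the names versionRegex, lastToken, and hasVersion are assumptions for illustration; they are not part of the stylesheet above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <xsl:param name="delim" select="'@'"/>

    <!-- Assumption: a version token looks like "may2013" or "may2013.html" -->
    <xsl:variable name="versionRegex" select="'^[a-z]+[0-9]+(\.html)?$'"/>

    <xsl:template match="contents">
        <xsl:variable name="urlTokens" select="tokenize(@Url, '_')"/>
        <xsl:variable name="lastToken" select="string($urlTokens[last()])"/>
        <xsl:variable name="hasVersion" select="matches($lastToken, $versionRegex)"/>

        <!-- URL: drop the last token only when it matches the version pattern -->
        <xsl:value-of select="if ($hasVersion)
                              then string-join($urlTokens[position() lt last()], '_')
                              else string(@Url)"/>
        <xsl:value-of select="concat($delim, @Title, $delim)"/>

        <!-- Version: capitalise the first letter, put a space before the digits, drop the extension -->
        <xsl:if test="$hasVersion">
            <xsl:value-of select="concat(upper-case(substring($lastToken, 1, 1)),
                                         replace(substring($lastToken, 2), '^([a-z]*)([0-9]+).*$', '$1 $2'))"/>
        </xsl:if>
        <xsl:text>&#xA;</xsl:text>
        <xsl:apply-templates/>
    </xsl:template>

</xsl:stylesheet>
```

With the sample input, the toc_cats row should then come out as toc_cats@Cats@, and the other rows should stay as they are. If real version tokens can contain uppercase letters or other separators, the pattern would need to be loosened.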

Answer 1 (score: 1)

By adding a few helper templates we end up with a bit of a beast of an XSLT, but it seems to do the trick...

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" indent="no"/>
  <!-- This variable sets the delimiter symbol that Excel will use to separate the cells -->
  <xsl:variable name="delimiter">@</xsl:variable>

  <xsl:template match="contents">
    <!-- Prints the URL -->
    <xsl:choose>
      <xsl:when test="contains(@Url, '.')">
        <xsl:call-template name="substring-before-last">
          <xsl:with-param name="list" select="@Url"/>
          <xsl:with-param name="delimiter" select="'_'"/>
        </xsl:call-template>            
      </xsl:when>
      <xsl:otherwise><xsl:value-of select="@Url"/></xsl:otherwise>
    </xsl:choose>
    <xsl:copy-of select="$delimiter"/>

    <!-- Prints the title -->
    <xsl:apply-templates select="@Title"/>
    <xsl:copy-of select="$delimiter"/>

    <!-- Now do all the tricks to format the version -->
    <xsl:variable name="withExtension">
      <xsl:call-template name="substring-after-last">
        <xsl:with-param name="string" select="@Url"/>
        <xsl:with-param name="delimiter" select="'_'"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="withoutExtension">
      <xsl:call-template name="substring-before-last">
        <xsl:with-param name="list" select="$withExtension"/>
        <xsl:with-param name="delimiter" select="'.'"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="withoutSpace">
      <xsl:value-of select="concat(translate(substring($withoutExtension, 1, 1), 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'), substring($withoutExtension, 2))"/>
    </xsl:variable>

    <xsl:variable name="year">
      <xsl:value-of select="translate($withoutSpace,translate($withoutSpace, '0123456789', ''), '')"/>
    </xsl:variable>

    <xsl:value-of select="concat(substring-before($withoutSpace, $year), ' ', $year)"/>
    <xsl:copy-of select="$delimiter"/>
  </xsl:template>

  <xsl:template match="/">
    <xsl:apply-templates select="//contents"/>
  </xsl:template>

  <xsl:template name="substring-before-last">
    <xsl:param name="list"/>
    <xsl:param name="delimiter"/>
    <xsl:choose>
      <xsl:when test="contains($list, $delimiter)">
        <xsl:value-of select="substring-before($list,$delimiter)"/>
        <xsl:choose>
          <xsl:when test="contains(substring-after($list,$delimiter),$delimiter)">
            <xsl:value-of select="$delimiter"/>
          </xsl:when>
        </xsl:choose>
        <xsl:call-template name="substring-before-last">
          <xsl:with-param name="list" select="substring-after($list,$delimiter)"/>
          <xsl:with-param name="delimiter" select="$delimiter"/>
        </xsl:call-template>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

  <xsl:template name="substring-after-last">
    <xsl:param name="string"/>
    <xsl:param name="delimiter"/>
    <xsl:choose>
      <xsl:when test="contains($string, $delimiter)">
        <xsl:call-template name="substring-after-last">
          <xsl:with-param name="string" select="substring-after($string, $delimiter)"/>
          <xsl:with-param name="delimiter" select="$delimiter"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$string"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

Output:

toc_animals@Animals@May 2013@toc_apes@Apes@May 2013@chimps@Some Stuff About Chimps@May 2013@toc_cats@Cats@ @hairless_cats@OMG Where Did the Hair Go?@May 2013@wild_cats@These Things Frighten Me@May 2013@toc_dogs@Dogs@May 2013@toc_snorty_dogs@Snorty Dogs@May 2013@boston_terriers@Boston Terriers@May 2013@french_bull_dogs@Frenchies@May 2013@
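
The output above lands on a single line, and toc_cats gets a lone space as its version. If one row per line is wanted, here is a rough XSLT 1.0 sketch of a variation. It assumes a version token always ends in .html and has both a letter part and a digit part. The names last, name, letters, digits, and isVersion are made up for this sketch. The substring-after-last template is the same as the one above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:variable name="delimiter">@</xsl:variable>
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//contents"/>
  </xsl:template>

  <xsl:template match="contents">
    <!-- Last underscore-separated token, e.g. "may2013.html" or "cats" -->
    <xsl:variable name="last">
      <xsl:call-template name="substring-after-last">
        <xsl:with-param name="string" select="@Url"/>
        <xsl:with-param name="delimiter" select="'_'"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- Token without its extension; empty when there is no ".html" -->
    <xsl:variable name="name" select="substring-before($last, '.html')"/>
    <xsl:variable name="letters" select="translate($name, '0123456789', '')"/>
    <xsl:variable name="digits" select="translate($name, $letters, '')"/>
    <!-- Assumption: a version has both a letter part and a digit part -->
    <xsl:variable name="isVersion" select="$letters != '' and $digits != ''"/>

    <!-- URL: cut off "_" plus the last token only when it is a version -->
    <xsl:choose>
      <xsl:when test="$isVersion">
        <xsl:value-of select="substring(@Url, 1, string-length(@Url) - string-length($last) - 1)"/>
      </xsl:when>
      <xsl:otherwise><xsl:value-of select="@Url"/></xsl:otherwise>
    </xsl:choose>
    <xsl:value-of select="concat($delimiter, @Title, $delimiter)"/>

    <!-- Version: capitalise the first letter and separate the digits with a space -->
    <xsl:if test="$isVersion">
      <xsl:value-of select="concat(translate(substring($letters, 1, 1), $lower, $upper), substring($letters, 2), ' ', $digits)"/>
    </xsl:if>
    <!-- End each record with a line break instead of a trailing delimiter -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <xsl:template name="substring-after-last">
    <xsl:param name="string"/>
    <xsl:param name="delimiter"/>
    <xsl:choose>
      <xsl:when test="contains($string, $delimiter)">
        <xsl:call-template name="substring-after-last">
          <xsl:with-param name="string" select="substring-after($string, $delimiter)"/>
          <xsl:with-param name="delimiter" select="$delimiter"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$string"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because the last token is already known, the URL prefix can be cut by length, so this variation does not need substring-before-last at all. Rows without a version, such as toc_cats, keep their full URL and get an empty version cell.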