How do I add a zero-width space after every three uppercase letters in XSLT?

Asked: 2016-02-01 06:46:18

Tags: regex xslt-2.0

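I want to add a zero-width space after every three uppercase letters in XSLT. To do this, I want to select every text node in the document and pick out the uppercase words within each text node.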

My sample XML is:

<doc>
    <front>
        <lable>this is a TEST TEXT</lable>
        <para>This is a TEST TEXT with UPPER and Lower</para>
    </front>
    <middle>
        <lable>this is a TEST TEXT</lable>
        <para>This is a TEST TEXT with UPPER and Lower</para>
    </middle>
    <back>
        <lable>This is a TEST TEXT</lable>
        <para>This is a TEST TEXT with UPPER and Lower</para>
    </back>
</doc>

The XSLT I wrote is:

<xsl:template match="*/text()" priority="100">
        <xsl:analyze-string select="." regex="^[A-Z]+">
            <xsl:matching-substring>
                <xsl:variable name="upperWord" select="substring(.,3)"/>
                <xsl:value-of select="concat($upperWord,'&#x200b;')"/>
            </xsl:matching-substring>

            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

The output I expect:

<doc>
            <front>
                <lable>this is a TES&#x200b;T TEX&#x200b;T</lable>
                <para>​his is a TES&#x200b;T TEX&#x200b;T with UPP&#x200b;ER and Lower</para>
            </front>
            <front>
                <lable>this is a TES&#x200b;T TEX&#x200b;T</lable>
                <para>​his is a TES&#x200b;T TEX&#x200b;T with UPP&#x200b;ER and Lower</para>
            </front>
            <front>
                <lable>this is a TES&#x200b;T TEX&#x200b;T</lable>
                <para>​his is a TES&#x200b;T TEX&#x200b;T with UPP&#x200b;ER and Lower</para>
            </front>
        </doc>

The output I got:

<doc>
        <front>
            <lable>this is a TEST TEXT</lable>
            <para>​his is a TEST TEXT with UPPER and Lower</para>
        </front>
        <middle>
            <lable>this is a TEST TEXT</lable>
            <para>​his is a TEST TEXT with UPPER and Lower</para>
        </middle>
        <back>
            <lable>​his is a TEST TEXT</lable>
            <para>​his is a TEST TEXT with UPPER and Lower</para>
        </back>
    </doc>

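I can't understand why only the capital first letter of a word is being selected, or why not all of the text nodes are affected. Can someone help me solve this? Thanks.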

2 answers:

Answer 0 (score: 0)

Try replacing

regex="^[A-Z]+"

with

regex="[A-Z]+"
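
To spell out why the original template behaves as it does: `^` anchors the match to the start of each text node, so only a leading run of capitals is ever matched, and `substring(., 3)` returns the text from the third character onward, not the first three characters. For the leading "T" of "This" that is an empty string, so the letter is replaced by the zero-width space alone. Removing `^` addresses the first problem but not the second. The following is a minimal sketch of one way to keep xsl:analyze-string, assuming the goal is a zero-width space after each group of three consecutive capitals; it goes beyond the one-line change suggested above and is not the answerer's code.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- priority="1" avoids a conflict with node() in the identity template. -->
  <xsl:template match="text()" priority="1">
    <!-- No ^ anchor, so every group of three capitals in the node is matched.
         The regex attribute is an attribute value template, so the braces
         of the quantifier have to be doubled. -->
    <xsl:analyze-string select="." regex="[A-Z]{{3}}">
      <xsl:matching-substring>
        <!-- Emit the three matched letters followed by U+200B. -->
        <xsl:value-of select="concat(., '&#x200b;')"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Everything else is copied through unchanged. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With this pattern, an uppercase word of exactly three letters also gets a trailing zero-width space, which may or may not be wanted.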

Answer 1 (score: 0)

You should be able to simplify this by dropping xsl:analyze-string and using replace() instead.

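Note: in my example I use xsl:character-map so that the zero-width space is serialized as an entity in the text. You can remove it and the actual character will be inserted instead.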

Example:

XML input

<doc>
    <front>
        <lable>this is a TEST TEXT</lable>
        <para>This is a TEST TEXT with UPPER and Lower</para>
    </front>
    <middle>
        <lable>this is a TEST TEXT</lable>
        <para>This is a TEST TEXT with UPPER and Lower</para>
    </middle>
    <back>
        <lable>This is a TEST TEXT</lable>
        <para>This is a TEST TEXT with UPPER and Lower</para>
    </back>
</doc>

XSLT 2.0

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" use-character-maps="chars"/>
  <xsl:strip-space elements="*"/>

  <xsl:character-map name="chars">
    <xsl:output-character character="&#x200b;" string="&amp;#x200b;"/>
  </xsl:character-map>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text()" priority="1">
      <xsl:value-of select="replace(.,'([A-Z]{3})','$1&#x200b;')"/>
  </xsl:template>

</xsl:stylesheet>

XML output

<doc>
   <front>
      <lable>this is a TES&#x200b;T TEX&#x200b;T</lable>
      <para>This is a TES&#x200b;T TEX&#x200b;T with UPP&#x200b;ER and Lower</para>
   </front>
   <middle>
      <lable>this is a TES&#x200b;T TEX&#x200b;T</lable>
      <para>This is a TES&#x200b;T TEX&#x200b;T with UPP&#x200b;ER and Lower</para>
   </middle>
   <back>
      <lable>This is a TES&#x200b;T TEX&#x200b;T</lable>
      <para>This is a TES&#x200b;T TEX&#x200b;T with UPP&#x200b;ER and Lower</para>
   </back>
</doc>
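
As the note above says, the character map is optional. Here is a minimal sketch of the same stylesheet without it, assuming it is acceptable for the result to contain the literal U+200B character, which is invisible in most editors, rather than the `&#x200b;` reference:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- No use-character-maps here, so U+200B is written as the character itself
       whenever the output encoding can represent it. -->
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Insert U+200B after each group of three consecutive capitals. -->
  <xsl:template match="text()" priority="1">
    <xsl:value-of select="replace(., '([A-Z]{3})', '$1&#x200b;')"/>
  </xsl:template>

</xsl:stylesheet>
```

Unlike the regex attribute of xsl:analyze-string, the select attribute is not an attribute value template, so the braces in `{3}` are written once.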