Counting the occurrences of a substring in a string in XSLT

Time: 2009-09-22 03:38:24

Tags: string xslt count

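I am writing a script to count the occurrences of a substring in a string in XSLT. It takes too long when I run it over more than 2 million records. Can anyone point out changes that would make it faster, or suggest another way to get the number of occurrences?

I mean a substring, not a single character, so I am not talking about the translate() function.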

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:template match="/">
    <Root>
      <NoofOccurane>
        <xsl:call-template name="GetNoOfOccurance">
          <xsl:with-param name="String" select="'My Name is Rohan and My Home name is also Rohan but one of my firend honey name is also Rohan'"/>
          <xsl:with-param name="SubString" select="'Rohan'"/>
        </xsl:call-template>
      </NoofOccurane>
      <NoofOccurane>
        <xsl:call-template name="GetNoOfOccurance">
          <xsl:with-param name="String" select="'My Name is Rohan and My Home name is also Rohan but one of my firend honey name is also Rohan'"/>
          <xsl:with-param name="SubString" select="'Sohan'"/>
        </xsl:call-template>
      </NoofOccurane>
      <NoofOccurane>
        <xsl:call-template name="GetNoOfOccurance">
          <xsl:with-param name="String" select="'My Name is Rohan and My Home name is also Mohan but one of my firend honey name is also Rohan'"/>
          <xsl:with-param name="SubString" select="'Mohan'"/>
        </xsl:call-template>
      </NoofOccurane>
    </Root>
  </xsl:template>

  <xsl:template name="GetNoOfOccurance">
    <xsl:param name="String"/>
    <xsl:param name="SubString"/>
    <xsl:variable name="LenString" select="string-length($String)" />
    <xsl:variable name="LenSubString" select="string-length($SubString)" />
    <xsl:variable name="ReplaceString">
      <xsl:call-template name="replace-string">
        <xsl:with-param name="text" select="$String"/>
        <xsl:with-param name="replace" select="$SubString"/>
        <xsl:with-param name="with" select="''"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:variable name="NewLenString" select="string-length($ReplaceString)" />
    <xsl:variable name="DiffLens" select="$LenString - $NewLenString" />
    <!-- Each removed occurrence shortens the string by $LenSubString characters.
         Assumes $SubString is not empty. -->
    <xsl:value-of select="$DiffLens div $LenSubString"/>
  </xsl:template>

  <!-- Template that emulates a string replace function -->
  <xsl:template name="replace-string">
    <xsl:param name="text"/>
    <xsl:param name="replace"/>
    <xsl:param name="with"/>
    <xsl:choose>
      <xsl:when test="contains($text,$replace)">
        <xsl:value-of select="substring-before($text,$replace)"/>
        <xsl:value-of select="$with"/>
        <xsl:call-template name="replace-string">
          <xsl:with-param name="text" select="substring-after($text,$replace)"/>
          <xsl:with-param name="replace" select="$replace"/>
          <xsl:with-param name="with" select="$with"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

The result is:

<Root>
  <NoofOccurane>3</NoofOccurane>
  <NoofOccurane>0</NoofOccurane>
  <NoofOccurane>1</NoofOccurane>
</Root>

2 answers:

Answer 0 (score: 2)

I would suggest:

<xsl:template name="GetNoOfOccurance">
  <xsl:param name="String"/>
  <xsl:param name="SubString"/>
  <xsl:param name="Counter" select="0" />

  <xsl:variable name="sa" select="substring-after($String, $SubString)" />

  <xsl:choose>
    <xsl:when test="$sa != '' or contains($String, $SubString)">
      <xsl:call-template name="GetNoOfOccurance">
        <xsl:with-param name="String"    select="$sa" />
        <xsl:with-param name="SubString" select="$SubString" />
        <xsl:with-param name="Counter"   select="$Counter + 1" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$Counter" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

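This XSLT 1.0 solution counts the occurrences of the substring with simple, direct recursion: each call strips everything up to and including the first match and increments the counter. No helper template is needed, and the result is the one you want: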

<Root>
  <NoofOccurane>3</NoofOccurane>
  <NoofOccurane>0</NoofOccurane>
  <NoofOccurane>1</NoofOccurane>
</Root>

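You can delete <xsl:template name="replace-string"> and replace your GetNoOfOccurance template with mine. No further code changes are needed; the calling convention is the same.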
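One edge case is worth guarding against: in XPath 1.0, contains() is true for an empty second argument, so calling the recursive template with an empty SubString would, as far as I can tell, never terminate. Below is a minimal sketch of the same recursion with that guard added. The template name CountOccurrences and the sample input are only illustrative assumptions, not part of the original code.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <Root>
      <NoofOccurane>
        <xsl:call-template name="CountOccurrences">
          <xsl:with-param name="String" select="'RohanRohan'"/>
          <xsl:with-param name="SubString" select="'Rohan'"/>
        </xsl:call-template>
      </NoofOccurane>
    </Root>
  </xsl:template>

  <!-- Hypothetical name; takes the same parameters as GetNoOfOccurance. -->
  <xsl:template name="CountOccurrences">
    <xsl:param name="String"/>
    <xsl:param name="SubString"/>
    <xsl:param name="Counter" select="0"/>
    <xsl:choose>
      <!-- Guard: an empty SubString is "contained" in every string,
           so report 0 instead of recursing. -->
      <xsl:when test="string-length($SubString) = 0">
        <xsl:value-of select="0"/>
      </xsl:when>
      <!-- One more match found: count it and continue after the match. -->
      <xsl:when test="contains($String, $SubString)">
        <xsl:call-template name="CountOccurrences">
          <xsl:with-param name="String" select="substring-after($String, $SubString)"/>
          <xsl:with-param name="SubString" select="$SubString"/>
          <xsl:with-param name="Counter" select="$Counter + 1"/>
        </xsl:call-template>
      </xsl:when>
      <!-- No match left: emit the accumulated count. -->
      <xsl:otherwise>
        <xsl:value-of select="$Counter"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The recursive call is the last instruction in its branch, so processors that optimize tail calls can run it without growing the stack; whether a given processor does so is something to check rather than assume.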

Answer 1 (score: 0)

http://www.xsltfunctions.com/xsl/functx_number-of-matches.html

It is documented there as:

count(tokenize($arg,$pattern)) - 1

I would write it as:

count(tokenize($string,$substring)) - 1

In your case:

count(tokenize('My Name is Rohan and My Home name is also Rohan but one of my firend honey name is also Rohan','Rohan')) - 1

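PS: you misspelled 'friend'.

I tested this approach in XSLT 2.0 for my own use case. Note that tokenize() is an XPath 2.0 function, so it is not available in an XSLT 1.0 processor, and its second argument is a regular expression rather than a literal substring.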
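Two details of the tokenize() expression are easy to miss: the pattern is treated as a regular expression, so a substring such as '.' would not be matched literally, and tokenizing a zero-length string returns an empty sequence, so the bare expression yields -1 for empty input. Here is a minimal XSLT 2.0 sketch that handles both cases. The function name my:count-substring and the namespace urn:example:count are hypothetical, chosen only for this illustration.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:my="urn:example:count"
  exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: counts non-overlapping literal occurrences. -->
  <xsl:function name="my:count-substring" as="xs:integer">
    <xsl:param name="string" as="xs:string"/>
    <xsl:param name="substring" as="xs:string"/>
    <!-- Backslash-escape regex metacharacters so the substring matches literally. -->
    <xsl:variable name="pattern"
      select="replace($substring, '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))', '\\$1')"/>
    <!-- Guard the two inputs for which tokenize() would error or give -1. -->
    <xsl:sequence select="
      if ($string = '' or $substring = '') then 0
      else count(tokenize($string, $pattern)) - 1"/>
  </xsl:function>

  <xsl:template match="/">
    <Root>
      <NoofOccurane>
        <xsl:value-of select="my:count-substring('a.b.c', '.')"/>
      </NoofOccurane>
    </Root>
  </xsl:template>

</xsl:stylesheet>
```

This needs an XSLT 2.0 processor; with a 1.0 processor, a recursive template remains the portable option.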