Finding a substring within a string

Time: 2014-02-20 19:58:39

Tags: java string algorithm

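I am trying to count how many times a substring occurs in a given string. At the moment, my code does not handle overlapping strings.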

For example:

substr = "cde", str = "cdcde"

My code:

public static int ssCount(String str, String substr) {
    int count = 0;
    int strlen = str.length();
    int substrlen = substr.length();
    int numsubstr = 0;
    int substrpointer = 0;

    for (int i = 0; i < strlen; i++) {
        if (str.charAt(i) == substr.charAt(substrpointer)) {
            substrpointer++;
            count++;
        }
        else {
            count = 0;
            substrpointer = 0;
        }
        if (count == substrlen) {
            numsubstr++;
            count = 0;
        }
    }
    return numsubstr;
}

My attempt:

public static int ssCount(String str, String substr) {
        int count = 0;
        int strlen = str.length();
        int substrlen = substr.length();
        int numsubstr = 0;
        int substrpointer = 0;
        int firstchar = 0;

        for (int i = 0; i < strlen; i++) {
            if (str.charAt(i) == substr.charAt(substrpointer)) {
                substrpointer++;
                count++;
                if (str.charAt(i) == substr.charAt(0)) {
                    firstchar = i;
                }
            }
            else {
                count = 0;
                substrpointer = 0;
                i = firstchar;
            }
            if (count == substrlen) {
                numsubstr++;
                count = 0;
            }
        }
        return numsubstr;
    }

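I tried adding a second pointer that points to the first character of the next possible occurrence of the substring, so that the comparison can resume from that point. But I am running into trouble, because it seems I can end up in an infinite loop.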
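For comparison, here is a minimal sketch of a version that never rewinds `i`, which sidesteps the loop problem: in the attempt above, with str = "cdcde" and substr = "cde", a mismatch sets `i` back to `firstchar` and the loop can keep revisiting the same position. The class name `NaiveOverlapCount` and the choice to return 0 for an empty `substr` are assumptions made for illustration, not part of the original code.

```java
public class NaiveOverlapCount {
    // Tests every start index independently, so overlapping matches are counted
    // and the loop index only ever moves forward.
    public static int ssCount(String str, String substr) {
        if (substr.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero occurrences
        }
        int numsubstr = 0;
        for (int i = 0; i + substr.length() <= str.length(); i++) {
            // regionMatches compares str[i .. i+len) against the whole of substr
            if (str.regionMatches(i, substr, 0, substr.length())) {
                numsubstr++;
            }
        }
        return numsubstr;
    }

    public static void main(String[] args) {
        System.out.println(ssCount("cdcde", "cde")); // prints 1
        System.out.println(ssCount("aaaa", "aa"));   // prints 3
    }
}
```

This does up to `str.length() * substr.length()` character comparisons in the worst case, which is usually acceptable for short inputs.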

2 answers:

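Answer 0: (score: 2)

This finds all overlapping occurrences of a substring in a larger string. The non-regex way is shown first, followed by the regex way. An interesting problem.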

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
   <P>{@code java OverlappingSubstringsXmpl}</P>
 **/
public class OverlappingSubstringsXmpl  {
   public static final void main(String[] igno_red)  {
      String sToFind = "cdc";
      String sToSearch = "cdcdcdedcdc";

      System.out.println("Non regex way:");

      // Resume each search one character past the previous match,
      // so that overlapping occurrences are not skipped.
      int iMinIdx = 0;
      while(iMinIdx <= (sToSearch.length() - sToFind.length()))  {
         int iIdxFound = sToSearch.indexOf(sToFind, iMinIdx);

         if(iIdxFound == -1)  {
            break;
         }

         System.out.println(sToFind + " found at index " + iIdxFound);

         iMinIdx = iIdxFound + 1;
      }

      System.out.println("Regex way:");

      // Pattern.LITERAL makes sToFind match as plain text rather than as a regex.
      Matcher m = Pattern.compile(sToFind, Pattern.LITERAL).matcher(sToSearch);
      boolean bFound = m.find();
      while (bFound) {
         System.out.println(sToFind + " found at index " + m.start());
         // find(int) resets the matcher and searches again from the given index.
         bFound = m.find(m.start() + 1);
      }
   }
}

Output:

[C:\java_code\]java OverlappingSubstringsXmpl
Non regex way:
cdc found at index 0
cdc found at index 2
cdc found at index 8
Regex way:
cdc found at index 0
cdc found at index 2
cdc found at index 8
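Since the question asks for a count rather than printed indices, the same `indexOf` loop could be wrapped in a method with the asker's `ssCount` signature. This is only a sketch: the class name `IndexOfOverlapCount` and the decision to return 0 for an empty `substr` are assumptions, not something the question or this answer specifies.

```java
public class IndexOfOverlapCount {
    // Counts occurrences of substr in str, including overlapping ones,
    // by restarting indexOf one character past each match.
    public static int ssCount(String str, String substr) {
        if (substr.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero occurrences
        }
        int numsubstr = 0;
        int from = 0;
        while (true) {
            int found = str.indexOf(substr, from);
            if (found == -1) {
                break; // no further occurrence at or after 'from'
            }
            numsubstr++;
            from = found + 1; // step by one, not by substr.length(), to keep overlaps
        }
        return numsubstr;
    }

    public static void main(String[] args) {
        System.out.println(ssCount("cdcde", "cde"));       // prints 1
        System.out.println(ssCount("cdcdcdedcdc", "cdc")); // prints 3
    }
}
```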

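Answer 1: (score: 1)

I'm not sure what your question is, presumably how to fix your code, but my suggestion is to look into the standard approaches to this problem, such as the KMP (Knuth-Morris-Pratt) algorithm. It also handles overlaps efficiently.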
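To make the suggestion concrete, here is a minimal sketch of counting overlapping occurrences with KMP. The names `KmpOverlapCount`, `buildFailure`, and `countOccurrences` are made up for this example, and returning 0 for an empty pattern is an assumption.

```java
public class KmpOverlapCount {
    // fail[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] buildFailure(String pattern) {
        int[] fail = new int[pattern.length()];
        int k = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    public static int countOccurrences(String text, String pattern) {
        if (pattern.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero occurrences
        }
        int[] fail = buildFailure(pattern);
        int count = 0;
        int k = 0; // number of pattern characters currently matched
        for (int i = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1]; // fall back without moving i backwards
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                count++;
                k = fail[k - 1]; // keep the longest border so overlapping matches count
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("cdcde", "cde"));       // prints 1
        System.out.println(countOccurrences("cdcdcdedcdc", "cdc")); // prints 3
    }
}
```

The text index `i` never moves backwards, which is what avoids the rewinding problem in the question; the running time is linear in the combined length of the text and the pattern.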