Finding pairs of strings

Time: 2017-11-14 09:05:59

Tags: string algorithm linear-search

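I am trying to practice some string- and algorithm-based problems, and I came across one with the following wording:

You have a string x of length N consisting of lowercase English letters. You have to find the number of substrings S of x such that 0 <= d < c < b < a <= N - 1, x[a] == x[c] and x[b] == x[d].

For example: x = "ababa". The answer is S=2, because two strings satisfy the condition above: abab and baba.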
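To make the condition concrete, here is a minimal brute-force sketch that simply tries every index quadruple d < c < b < a and checks the two equalities. The class and method names are illustrative assumptions, and the O(N^4) running time makes it suitable only for cross-checking faster approaches on short inputs.

```java
final class BruteForceQuadruples {
    // Counts index quadruples d < c < b < a with x[a] == x[c] and x[b] == x[d].
    static long count(String x) {
        int n = x.length();
        long total = 0;
        for (int d = 0; d < n; d++) {
            for (int c = d + 1; c < n; c++) {
                for (int b = c + 1; b < n; b++) {
                    if (x.charAt(b) != x.charAt(d)) {
                        continue; // x[b] must equal x[d]
                    }
                    for (int a = b + 1; a < n; a++) {
                        if (x.charAt(a) == x.charAt(c)) {
                            total++; // x[a] equals x[c]
                        }
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(count("ababa")); // prints 2
    }
}
```

For "ababa" the two quadruples found are (0,1,2,3) and (1,2,3,4), which correspond to abab and baba.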

What is the best way to solve this problem?

Thanks in advance.

2 answers:

Answer 0 (score: 0)

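If the condition x[a] == x[c] and x[b] == x[d] is true, we need to consider two cases:

The first is x[a] == x[b], i.e. the same character occurs four or more times. We call this case a similarity. To handle it, we build a structure that stores each character together with its frequency; whenever that frequency is >= 4, we have the pattern we are looking for. You can see it handled separately in the code below. The number of possible combinations can be computed mathematically with: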

C(n,r)=n!/((n−r)!r!) // r=4 here, and n is the frequency of the character.
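As a small illustration of this formula with r = 4, the sketch below evaluates C(n,4) step by step instead of through factorials, because n! no longer fits in a 32-bit int once n exceeds 12. The names `choose4` and `countAllEqual` are assumptions made for this example, and the input is assumed to contain only lowercase English letters.

```java
final class Choose4 {
    // C(n,4) built up as C(n,2) -> C(n,3) -> C(n,4); every division is exact.
    static long choose4(long n) {
        if (n < 4) {
            return 0;
        }
        long c = n * (n - 1) / 2; // C(n,2)
        c = c * (n - 2) / 3;      // C(n,3)
        c = c * (n - 3) / 4;      // C(n,4)
        return c;
    }

    // Sums C(freq,4) over the frequency of each lowercase letter in x.
    static long countAllEqual(String x) {
        long[] freq = new long[26];
        for (int i = 0; i < x.length(); i++) {
            freq[x.charAt(i) - 'a']++;
        }
        long total = 0;
        for (long f : freq) {
            total += choose4(f);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countAllEqual("aaaaa")); // prints 5
    }
}
```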

We count all the similarities and add them to the total for the non-similarities.

The other case is finding two different characters, x[a] != x[b]:

X="arbsatbuavb"     Then n=3 (b after a 3 times); S=n(n-1)/2=3
X="arbsatbuavbwaxb" Then n=4; S=n(n-1)/2=6

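Here we need to scan the array and look for every occurrence of two different characters, using the pair as the key of a dictionary structure <key,value>, where value is the number of occurrences of b after a (but no more Bs counted for that a). Then, for each key in the structure, we use the following formula to get the total number of substrings that satisfy the condition in the non-similarity case: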

S = Σ n_k(n_k-1)/2 over all keys k, where n_k is the value stored for key k

Here is the implemented algorithm and its results:

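import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// The methods below are assumed to be members of a single class.
public static void main(String[] args) {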
    System.out.println("TOTAL S ="+ calculate("aaaaa"));
}

public static int calculate(String str) {
    int s = 0;
    Map<String, Integer> struct = new HashMap<String, Integer>();
    Map<String, String> indexes = new HashMap<String, String>();
    Map<String, String> similarities = new HashMap<String, String>();
    String[] x = str.split("(?!^)");// convert the string to array.

    //Handle similarities
    for (int i = 0; i < x.length; i++) {
        if (similarities.containsKey(x[i])) {
            similarities.put(x[i], similarities.get(x[i]) + "," + i); // "a": 1,3,7...
        } else {
            similarities.put(x[i], i + "");
        }
    }

    //Ignore similarities
    ArrayList<String> temp = new ArrayList<String>();
    for (int i = 0; i < x.length - 1; i++) {
        temp.clear();// this temp is important otherwise "cdxd" will count
                        // "cd" twice!!!
        for (int j = i + 1; j < x.length; j++) {
            if (!x[i].equals(x[j])) {// for example if "abcdamn" when reach the second a stop j and jump to the next i.
                if (struct.containsKey(x[i] + x[j])) { // NOTE x[i] + x[j] is a String
                    if (!temp.contains(x[j])) {


                        struct.put(x[i] + x[j], struct.get(x[i] + x[j]) + 1);
                        temp.add(x[j]);

                        //Update
                        indexes.put(x[i] + x[j],indexes.get(x[i] + x[j])+ ";"+i+","+j);

                    }
                }

                //UP: similarities are excluded here, e.g. for "aaaaa" we have C(5,4) = 5, which is counted separately below

                else {

                    struct.put(x[i] + x[j], 1); // NOTE x[i] + x[j] is a String
                    temp.add(x[j]);

                    //Update
                    indexes.put(x[i] + x[j], i+","+j);
                }
            } else {
              break;
            }
        }
    }
    // now compute the result when similarities ignored
    for (Map.Entry<String, Integer> entry : struct.entrySet()) {
        s += entry.getValue() * (entry.getValue() - 1) / 2;
    }


    //Update
    //calculating s taking similarities into account
    int simil=0;
    ArrayList<String> perm=new ArrayList<String>();
    System.out.println("String pairs of '"+str+"' :");
    System.out.println("Similarities ie.(aaaaa)");
    for (Map.Entry<String, String> entry : similarities.entrySet()) {
        if(entry.getValue().split(",").length>=4)
        {
            String[] indxsim=entry.getValue().split(",");
            // C(n,4) = n(n-1)(n-2)(n-3)/24, computed directly: factorial(n) overflows int once n > 12
            long cnt = indxsim.length;
            simil += (int) (cnt * (cnt - 1) * (cnt - 2) * (cnt - 3) / 24);

          //show similarities results:12345=>1234;1235;1245;1345;2345
            for(int i=0;i<indxsim.length-3;i++)
                for(int j=i+1;j<indxsim.length-2;j++)
                    for(int k=j+1;k<indxsim.length-1;k++)
                        for(int l=k+1;l<indxsim.length;l++)
                        {
                            if(!perm.contains(indxsim[i]+indxsim[j]+indxsim[k]+indxsim[l]))//indxsim[i] is String
                            {
                                perm.add(indxsim[i]+indxsim[j]+indxsim[k]+indxsim[l]);
                                System.out.println(indxsim[i]+indxsim[j]+indxsim[k]+indxsim[l]);
                            }
                        }
        }
    }
    //show results by parsing indexes and calculating sub strings
    System.out.println("NON-Similarities (cd*cd*)");

    for (Map.Entry<String, String> entry : indexes.entrySet()) {
        if(entry.getValue().split(",").length>2)
        {
            String[] indx=entry.getValue().split(";");
            for (int i=0;i<indx.length-1;i++)
                for(int j=i+1;j<indx.length;j++)
                {
                    System.out.println(indx[i]+","+indx[j]);
                }
        }
    }
    s+=simil;
    return s;
}

  public static int factorial(int n) {
        if (n == 0) {
            return 1;
        }
        int fact = 1; // this  will be the result
        for (int i = 1; i <= n; i++) {
            fact *= i;
        }
        return fact;
    }

Results: taken directly from the program's output. Indexes are zero-based!

    String pairs of 'abababa' :
Similarities ie.(aaaaa)
0246
NON-Similarities (cd*cd*)
0,1,2,3
0,1,4,5
2,3,4,5
1,2,3,4
1,2,5,6
3,4,5,6
TOTAL S =7



    String pairs of 'www.google.com' :
Similarities ie.(aaaaa)
NON-Similarities (cd*cd*)
3,5,10,12
4,5,7,12
TOTAL S =2



 String pairs of 'hellothisisarandomtext' :
Similarities ie.(aaaaa)
NON-Similarities (cd*cd*)
0,4,6,16
0,5,6,18
7,8,9,10
1,5,19,21
4,5,16,18
0,1,6,19
TOTAL S =6






 String pairs of 'ababaaa' :
Similarities ie.(aaaaa)
0245
0246
0256
0456
2456
NON-Similarities (cd*cd*)
0,1,2,3
1,2,3,4
TOTAL S =7

Updated according to the comments.

Regards.

Answer 1 (score: 0)

Hint: if f(a) returns the number of ab combinations to the right of a, inclusive, then:

f(next-a-to-the-left) = f(a) + count of b's to the right of next-a-to-the-left

For each b, there are

f(next-a-to-the-right) * count of a's to the left

valid combinations.
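One possible way to turn this hint into code is sketched below; it is an interpretation, not the answerer's own implementation. For every ordered letter pair (p, q) it precomputes, from right to left, how many "p then q" pairs start at or after each index, and then, for each q in the middle position, multiplies the number of p's to its left by the number of such pairs to its right. The class and method names are assumptions, the input is assumed to contain only lowercase English letters, and the running time is O(26 * 26 * N).

```java
final class PairPatternCounter {
    // Total number of quadruples d < c < b < a with x[d] == x[b] and x[c] == x[a].
    static long count(String x) {
        long total = 0;
        for (char p = 'a'; p <= 'z'; p++) {
            for (char q = 'a'; q <= 'z'; q++) {
                total += countForLetters(x, p, q);
            }
        }
        return total;
    }

    // Counts quadruples with x[d] == x[b] == p and x[c] == x[a] == q.
    static long countForLetters(String x, char p, char q) {
        int n = x.length();
        // pairsFrom[i] = number of index pairs (b, a) with i <= b < a, x[b] == p, x[a] == q
        long[] pairsFrom = new long[n + 1];
        long qSeen = 0; // number of q's strictly to the right of i
        for (int i = n - 1; i >= 0; i--) {
            pairsFrom[i] = pairsFrom[i + 1];
            if (x.charAt(i) == p) {
                pairsFrom[i] += qSeen;
            }
            if (x.charAt(i) == q) {
                qSeen++;
            }
        }
        long total = 0;
        long pSeen = 0; // number of p's strictly to the left of c
        for (int c = 0; c < n; c++) {
            if (x.charAt(c) == q) {
                total += pSeen * pairsFrom[c + 1];
            }
            if (x.charAt(c) == p) {
                pSeen++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(count("ababa")); // prints 2
    }
}
```

When p and q are the same letter, the same loop counts the quadruples made of four equal characters, so no separate case is needed in this sketch.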