Question

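I am trying to solve this problem. I can solve it with brute force, but the optimized algorithm below gives incorrect results for some test cases. I have tried but cannot find the problem in the code. Can anyone help me?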

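Problem: Given a string S and an integer K, find the integer C equal to the number of pairs of substrings (S1, S2) such that S1 and S2 have equal length and Mismatch(S1, S2) <= K, where the mismatch function is defined below.

The Mismatch function

Mismatch(s1, s2) is the number of positions at which the characters of S1 and S2 differ. For example, Mismatch(bag, boy) = 2 (there are mismatches in the second and third positions), Mismatch(cat, cow) = 2 (again, mismatches in the second and third positions), and Mismatch(London, Mumbai) = 6 (since the characters differ at every position of the two strings). The first character of London is 'L' whereas that of Mumbai is 'M', the second character of London is 'o' whereas that of Mumbai is 'u', and so on.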
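As a minimal sketch of the definition above, the mismatch count can be written as a small helper. The name `mismatch` and the use of `std::string` are assumptions made for illustration; the post itself works on a `char` array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of positions at which two equal-length strings differ.
// Hypothetical helper for illustration; not part of the original post.
int mismatch(const std::string& a, const std::string& b)
{
    assert(a.size() == b.size());  // the definition only covers equal lengths
    int count = 0;
    for (std::size_t p = 0; p < a.size(); ++p)
        if (a[p] != b[p])
            ++count;
    return count;
}
```

With this helper, `mismatch("bag", "boy")` and `mismatch("cat", "cow")` both give 2, and `mismatch("London", "Mumbai")` gives 6, matching the examples in the problem statement.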

#include <iostream>
#include <cstring>
using namespace std;

int main() {
    int k;
    char str[6000];
    cin >> k;
    cin >> str;
    int len = strlen(str);
    int i, j, x, l, m, mismatch, count, r;

    count = 0;

    for (i = 0; i < len - 1; i++)
        for (j = i + 1; j < len; j++)
        {
            mismatch = 0;
            for (r = 0; r < len - j + i; r++)
            {
                if (str[i + r] != str[j + r])
                {
                    ++mismatch;
                    if (mismatch >= k) break;
                }
                if (mismatch <= k) ++count;
            }
        }
    cout << count;
    return 0;
}

Sample test cases

Test case (the code above passes)
```
**input** 
0
abab

**output** 
3
```

Test case (the code above fails)

**input** 
3
hjdiaceidjafcchdhjacdjjhadjigfhgchadjjjbhcdgffibeh

**expected output**
4034

**my output**
4335

Answer 1

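You have two errors. First,

for(r=1;r<len;r++)

should be

for(r=1;r<=len-j;r++)

Otherwise,

str[j+r]

will at some point start comparing characters past the null terminator (i.e. beyond the end of the string). The largest r can be is the number of characters remaining from index j to the last character.

Second, writing

str[i+r]

and

str[j+r]

skips the comparison of the characters at i and j themselves, because r is always at least 1. You should write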

for(r=0;r<len-j;r++)

Answer 2

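You have two basic errors. You are exiting when mismatch >= k instead of mismatch > k (mismatch == k is an acceptable count), and you are letting r get too large. These skew the final count in opposite directions but, as you can see, the second error "wins".

The real inner loop should be: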

for (r = 0; r < len - j; ++r)
{
    if (str[i + r] != str[j + r])
    {
        ++mismatch;
        if (mismatch > k)
            break;
    }
    ++count;
}

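r is the index into the substrings, and j + r must be less than len to be valid for the right substring. Since i < j, i + r is then within bounds as well.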

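Also, you want to break when mismatch > k, not when it is >= k, since k mismatches are allowed.

Next, if you test for too many mismatches right after incrementing mismatch, you do not have to test again before counting.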

Finally, r
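Putting the corrected inner loop into a complete function gives the brute-force counter below. This is a sketch under assumptions: the name `countPairsBruteForce` and the `std::string` parameter are introduced here for illustration, whereas the question reads into a `char` array inside `main`.

```cpp
#include <cassert>
#include <string>

// Brute-force count of equal-length substring pairs with at most k mismatches.
// Hypothetical wrapper around the corrected loop; O(n^3) in the worst case.
long long countPairsBruteForce(const std::string& s, int k)
{
    assert(k >= 0);
    const int len = static_cast<int>(s.size());
    long long count = 0;
    for (int i = 0; i < len - 1; ++i) {
        for (int j = i + 1; j < len; ++j) {
            int mismatch = 0;
            // r is the offset inside both substrings; j + r must stay below len.
            for (int r = 0; r < len - j; ++r) {
                if (s[i + r] != s[j + r]) {
                    ++mismatch;
                    if (mismatch > k)  // exactly k mismatches is still allowed
                        break;
                }
                ++count;  // the pair of length r + 1 starting at (i, j) is valid
            }
        }
    }
    return count;
}
```

For the passing test case from the question, `countPairsBruteForce("abab", 0)` returns 3: the pairs are (a, a), (b, b) and (ab, ab).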

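Note: you asked about a faster approach. There is one, but the coding is more involved. Make the outer loop run over the difference delta between the two starting index values (0 < delta < len). Then count all acceptable matches with something like the following: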

count = 0
for delta = 1 to len-1
    set i=0; j=delta; mismatches=0; r=0
    while j < len
        .. find k'th mismatch, or end of str:
        while mismatches < k and j+r < len
            if str[i+r] != str[j+r] then mismatches = mismatches+1
            r = r+1
        end while
        .. extend r to cover any trailing matches:
        while j+r < len and str[i+r] == str[j+r]
            r = r+1
        end while

        .. arrive here with r being the longest string pair starting at str[i]
        .. and str[j] with no more than k mismatches. This loop will add (r)
        .. to the count and advance i,j one space to the right without recounting
        .. the character mismatches inside.  Rather, if a mismatch is dropped off
        .. the front, then mismatches is decremented by 1.
        .. An empty window (r == 0, possible only when k == 0) has nothing to drop.
        repeat
            count = count + r
            if r > 0 then
                if str[i] != str[j] then mismatches = mismatches-1
                r = r-1
            end if
            i = i+1, j = j+1
        until mismatches < k or r == 0
    end while
end for

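That is pseudocode, and rough pseudocode at that. The general idea is to compare, in a single pass, all substring pairs whose starting indices differ by delta, starting at the left and growing the substring length r until either the end of the source string is reached or the (k+1)-th mismatch is seen. That is, str[j+r] is either the end of the string or the position in the right substring of the one mismatch too many. This leaves r substring pairs, starting at str[i] and str[j], with k or fewer mismatches.

So count those r substring pairs and move to the next position, i = i+1 and j = j+1, with the new length r = r-1, decrementing the mismatch count if an unequal character pair was dropped off the left side.

It should be easy to see that on each loop iteration either r increases by 1, or j increases by 1 while (j + r) stays the same. Both j and (j + r) reach len in O(n) time, so the whole thing is O(n^2).

Edit: I fixed the handling of r, so the above should be more correct. The improvement to O(n^2) run time may help.

Re-edit: fixed comment errors. Re-re-edit: more typos in the algorithm, mostly "mismatches" misspelled and an increment by 2 instead of 1.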
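The same O(n^2) idea can be sketched in C++ as a two-pointer scan along each diagonal. Note the swap in formulation: the description above anchors the window at its left end and tracks a length r, while the sketch below counts, for every right end, how many left ends are still valid. Both count the same pairs. The name `countPairsSliding` and the `std::string` parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// O(n^2) count of equal-length substring pairs with at most k mismatches.
// Hypothetical function; a right-anchored variant of the sliding window idea.
long long countPairsSliding(const std::string& s, int k)
{
    assert(k >= 0);
    const int len = static_cast<int>(s.size());
    long long count = 0;
    // delta is the distance between the two starting indices (j - i).
    for (int delta = 1; delta < len; ++delta) {
        int left = 0;        // smallest start index still valid for this window
        int mismatches = 0;  // mismatching positions inside [left, right]
        for (int right = 0; right + delta < len; ++right) {
            // Take in the character pair at offset right.
            if (s[right] != s[right + delta])
                ++mismatches;
            // Drop pairs from the front until at most k mismatches remain.
            while (mismatches > k) {
                if (s[left] != s[left + delta])
                    --mismatches;
                ++left;
            }
            // Every start in [left, right] gives a valid pair ending at right.
            count += right - left + 1;
        }
    }
    return count;
}
```

Each of `left` and `right` only moves forward within one value of `delta`, so the work per diagonal is linear. For the passing test case from the question, `countPairsSliding("abab", 0)` returns 3.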

Answer 3

@Mike I made some modifications to your logic; here is the corrected code...

#include <iostream>
#include <string>
using namespace std;

int main()
{
    long long k, c = 0;
    string s;
    cin >> k >> s;
    int len = s.length();

    // gap is the distance between the two starting indices (j - i).
    for (int gap = 1; gap < len; gap++)
    {
        int i = 0, j = gap, mm = 0, tmp_len = 0;

        while (j < len)
        {
            // Extend the window until it holds k+1 mismatches or hits the end.
            while (mm <= k && (j + tmp_len) < len)
            {
                if (s[i + tmp_len] != s[j + tmp_len])
                    mm++;
                tmp_len++;
            }
            // Back off the (k+1)-th mismatch so the window is valid again.
            if (mm > k) { tmp_len--; mm--; }

            // tmp_len valid pairs start at (i, j).
            c = c + tmp_len;

            // Slide one position to the right. Only a non-empty window can
            // lose a character pair (it is empty when k == 0 and s[i] != s[j]).
            if (tmp_len > 0)
            {
                if (s[i] != s[j]) mm--;
                tmp_len--;
            }
            i++;
            j++;
        }
    }
    cout << c << endl;
    return 0;
}

Substrings with at most K mismatches?
