Find the longest border of a string

Time: 2010-10-22 12:59:58

Tags: algorithm string

First, let me explain what a border of a string is.

let x = "abacab"
let y = "ababab"

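A border of a string is a substring that is both a proper prefix and a proper suffix of the string; "proper" means that the whole string does not count as a substring. The longest border of x is "ab". The longest border of y is "abab" (the prefix and suffix may overlap).

Another example:
In the string "abcde hgrab abcde", "abcde" is both a prefix and a suffix. So it is also the longest border of that string.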
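
The definition can be written down as a small predicate. This is only a minimal Python sketch: the helper name `is_border` is made up for illustration, and rejecting the empty string is an assumption, since the question does not say whether it counts.

```python
def is_border(s, b):
    """Return True if b is a border of s: a proper prefix that is also a suffix."""
    # "Proper" excludes the whole string. The empty string is rejected too,
    # which is an assumption made for this sketch.
    if not b or len(b) >= len(s):
        return False
    return s.startswith(b) and s.endswith(b)


print(is_border("abacab", "ab"))      # x from the question
print(is_border("ababab", "abab"))    # y: the prefix and suffix overlap
print(is_border("ababab", "ababab"))  # the whole string is not a border
```

The overlap case needs no special handling, because the prefix test and the suffix test are independent of each other.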

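How can I find the longest border of a string?

10 answers:

Answer 0 (score: 19)

Finding the "border of a string" is exactly what the prefix function (also known as the failure function) of the Knuth-Morris-Pratt algorithm does. An implementation in C++ (a slightly modified version of this code):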

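#include <string>
#include <vector>
using namespace std;

int longestBorder(const string& s) {
    int len = s.length();
    if (len == 0) return 0;  // an empty string has no border
    vector<int> prefixFunc(len);
    prefixFunc[0] = 0;
    int curBorderLen = 0;
    for (int i = 1; i < len; ++i) {
        // Fall back to shorter borders until one can be extended by s[i].
        while (curBorderLen > 0 && s[curBorderLen] != s[i])
            curBorderLen = prefixFunc[curBorderLen - 1];
        if (s[curBorderLen] == s[i])
            ++curBorderLen;
        prefixFunc[i] = curBorderLen;
    }
    return prefixFunc[len - 1];
}

Runnable version: http://ideone.com/hTW8FL

The complexity of this algorithm is O(n).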

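Answer 1 (score: 3)

Here is a Java implementation, based on the assumption that a border is a proper substring. (Otherwise the longest border would simply be the length of the string.)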

public static int findLongestBorder(String s) {
    int len = s.length();
    for (int i = len - 1; i > 0; i--) {
        String prefix = s.substring(0, i);
        String suffix = s.substring(len - i, len);
        if (prefix.equals(suffix)) {
            return i;
        }
    }
    return 0;
}

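This could be optimized a little by starting from the string's character array and comparing individual characters, but the idea behind the algorithm is clearer the way I wrote it.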
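
The character-by-character variant mentioned above might look roughly like this. It is a Python sketch rather than the author's Java; the name `find_longest_border` simply mirrors the Java method, and the approach still takes quadratic time in the worst case.

```python
def find_longest_border(s):
    """Length of the longest proper border, comparing characters in place."""
    n = len(s)
    # Try candidate lengths from longest to shortest, as the Java version does.
    for length in range(n - 1, 0, -1):
        offset = n - length  # start index of the suffix of this length
        # all() stops at the first mismatch, so no substrings are built.
        if all(s[i] == s[offset + i] for i in range(length)):
            return length
    return 0


print(find_longest_border("abacab"))
print(find_longest_border("ababab"))
```

The saving is in allocations only: each candidate length is rejected at its first mismatching character instead of after two substrings have been copied.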

Answer 2 (score: 2)

Here is a commented JS solution that uses the prefix function DAIe mentioned:

function getPrefixBorders(string) {
    // This will contain the border length for each
    // prefix in ascending order by prefix length.
    var borderLengthByPrefix = [0];

    // This is the length of the border on the current prefix.
    var curBorderLength = 0;

    // Loop from the 2nd character to the last.
    for (var i = 1; i < string.length; i++) {

        // As long as a border exists but the character
        // after it doesn't match the current character, 
        while (curBorderLength > 0 && string[curBorderLength] !== string[i])
            // set the border length to the length of the current border's border.
            curBorderLength = borderLengthByPrefix[curBorderLength - 1];

        // If the characters do match,
        if (string[curBorderLength] === string[i])
            // the new border is 1 character longer.
            curBorderLength++;

        // Note the border length of the current prefix.
        borderLengthByPrefix[i] = curBorderLength;
    }

    return borderLengthByPrefix;
}

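It returns the length of the longest border for every prefix of the string (which is more than was asked for, but it does so in linear time). So, to get the length of the longest border of the full string: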

var string = "ababab";
var borderLengthsByPrefix = getPrefixBorders(string); // [0,0,1,2,3,4]
var stringBorderLength = borderLengthsByPrefix[borderLengthsByPrefix.length - 1];
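
Because the array holds a border length for every prefix, it can also list every border of the full string, not only the longest one. Below is a hedged Python sketch of that idea; `prefix_function` is a port of the JS function above, and `all_border_lengths` is a hypothetical helper that is not part of the original answer.

```python
def prefix_function(s):
    """Longest border length for every prefix of s."""
    borders = [0] * len(s)
    cur = 0
    for i in range(1, len(s)):
        # Fall back to the current border's own border until s[i] extends it.
        while cur > 0 and s[cur] != s[i]:
            cur = borders[cur - 1]
        if s[cur] == s[i]:
            cur += 1
        borders[i] = cur
    return borders


def all_border_lengths(s):
    """All border lengths of s, longest first."""
    borders = prefix_function(s)
    result = []
    cur = borders[-1] if borders else 0
    while cur > 0:
        result.append(cur)
        # The next shorter border of s is the longest border of this border.
        cur = borders[cur - 1]
    return result


print(prefix_function("ababab"))     # [0, 0, 1, 2, 3, 4]
print(all_border_lengths("ababab"))  # [4, 2]
```

For "ababab" the chain is 4, then 2, which corresponds to the borders "abab" and "ab".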

Another great resource for understanding how this works is this video on Coursera (and the one before it).

Answer 3 (score: 1)

To get the length of the longest border, do this:

def get_border_size(astr):
    border = 0
    for i in range(len(astr)):
        if astr[:i] == astr[-i:]:
            border = i
    return border

To get the longest border itself:

def get_border(astr):
    border = ""  # empty string when there is no border
    for i in range(len(astr)):
        if astr[:i] == astr[-i:]:
            border = astr[:i]
    return border

Answer 4 (score: 1)

My solution uses Python3 (it also works with Python2), with Counter from the collections module and max().

Here is my solution:

from collections import Counter

def get_seq(a):
    data = []
    for k in range(1, len(a)):
        data.append(a[:k])
        data.append(a[k:])

    return Counter(data)

def get_max_sublist(a):
    bb = [k for k in a.items() if k[1] > 1]
    try:
        k, j = max(bb, key= lambda x: len(x[0]))
        n, _ = max(a.items(), key= lambda x: x[1])

    except ValueError:
        return None

    else:
        return k if j > 1 else n



seq = ["abacab", "ababab", "abxyab", "abxyba", "abxyzf", "bacab"]

for k in seq:
    j = get_seq(k)
    print("longest border of {} is: {}".format(k, get_max_sublist(j)))

Output:

longest border of abacab is: ab
longest border of ababab is: abab
longest border of abxyab is: ab
longest border of abxyba is: a
longest border of abxyzf is: None
longest border of bacab is: b

Answer 5 (score: 1)

This simple solution does it with just a single loop:

function findLongestBorder($s){
    $a = 0;
    $b = 1;
    $n = strlen($s);

    while($b<$n){
        if($s[$a]==$s[$b]){
            $a++;
        }else{
            $b-= $a;
            $a = 0;
        }
        $b++;
    }

    return substr($s,0,$a);
}

Example:

echo findLongestBorder("abacab")."\n";
echo findLongestBorder("ababab")."\n";
echo findLongestBorder("abcde hgrab abcde")."\n";
echo findLongestBorder("bacab")."\n";
echo findLongestBorder("abacababac")."\n";

Output:

ab
abab
abcde
b
abac

See https://eval.in/812640

Answer 6 (score: 1)

I have been using a lot of JavaScript lately, so I did it in JavaScript:

function findBorder() {
  var givenString = document.getElementById("string").value;
  var length = givenString.length;
  var y = length;
  var answer;
  var subS1;
  var subS2;
  for (var x = 0; x < length; x++ ){
    subS1 = givenString.substring(0, x);
    subS2 = givenString.substring(y);
    if(subS2 === subS1){
      answer = subS1;
    }
    y--;
  }
  document.getElementById("answer").innerHTML = answer.toString();
}
<h1>put the string in here</h1>

<input type="text" id="string" />
<button id="goButton" onclick="findBorder()">GO</button>


<h3 id="answer"></h3>

Answer 7 (score: 0)

Here is an implementation in C that finds the longest border of a string:

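#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Declared before main so the call below is not an implicit declaration. */
int isasubstring(char *a, int s, int n);
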
int main()
{
   char str[]="abcdefabcanabcabccabcdef";
   int n,i,j,l,k,count,max=0;
   l=strlen(str);

   for(i=0;i<(l/2);i++)
   {   j=l-1-i;
       k=0;
       count=0;
       while(k<=i&&j<l)
      {
        if(str[k]==str[j])
            count++;
         k++;
         j++;

      }
    if(count==(k))
    {
        if(isasubstring(str,k,l-(2*(k))))
            if(max<count)
                max=count;
    }
}

return 0;
}

The function isasubstring, used to find the maximum width and the pattern of the border in the string:

int isasubstring(char *a,int s,int n)
{
int i,j;
char *temp;
char *pattern=malloc(sizeof(char)*(s+1));
char *input =malloc(sizeof(char)*(n+1));
memcpy(pattern,a,s);
pattern[s]='\0';
j=0;
for(i=s;i<=s+n-1;i++)
{
    input[j]=a[i];
    j++;
}
input[j]='\0';
printf("The string between the border :%s\n The longest border is: %s\n",input,pattern);
temp=strstr(input,pattern);
if(temp)
    return 1;
else
    return 0;

}

The output of the program is as follows (when the input is abcdefabcanabcabccabcdef):

The string between the border :abcanabcabcc
The longest border is: abcdef

Answer 8 (score: 0)

An implementation in Perl, using regular expression matching:

use strict;
use warnings;
while(<STDIN>)
{
    if ( /^([a-zA-Z]+).*\1$/)
    {
            print "Longest Border : $1\n";
    }
    else
    {
            print "No border in the pattern as suffix and prefix\n";
    }
}

The program reads a string from standard input and looks for the pattern.

^ - beginning of the line
$ - end of the line
([a-zA-Z]+) - Grouping the pattern which holds in $1 or \1 variable
.* - Match any characters in between the borders.
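
To see how this pattern behaves on the strings from the question, here is a rough Python equivalent using the standard `re` module. The helper name `regex_border` is made up for this sketch, and the character class is written as `[a-zA-Z]` on the assumption that only letters were intended.

```python
import re

# Same shape as the Perl pattern: a captured letters-only prefix,
# anything in between, then the captured text again at the end.
BORDER_RE = re.compile(r"^([a-zA-Z]+).*\1$")


def regex_border(s):
    """Return the border found by the regex, or None if nothing matches."""
    m = BORDER_RE.match(s)
    return m.group(1) if m else None


print(regex_border("abacab"))  # ab
print(regex_border("ababab"))  # ab, not abab: the two copies cannot overlap
print(regex_border("abxyzf"))  # None
```

Since the prefix and the back-reference consume separate characters, this approach only finds borders that do not overlap, and only ones made entirely of letters.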

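Answer 9 (score: -1)

If you are talking about character arrays, I think you want the following. This assumes that the border is the first and last character of the string. Your examples do not make clear what a border is; you need to define it more precisely.

x = abcde
border = { x[0], x[length(x)-1] }

If you need the length:

length(z) {
    return sizeof(z) / sizeof(z[0]);
}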