$string1 = "peachbananaapplepear";
$string2 = "juicenaapplewatermelonpear";
我想知道包含单词" apple"的最长公共子字符串是什么。
$string2 =~ m/.+apple.+/;
print $string2;
所以我使用匹配运算符和.+
来匹配关键字" apple"之前和之后的任何字符。当我打印$string2
时,它不会返回naapple
,而是会返回原来的$string2
。
答案 0 :(得分:1)
这是一种方法。首先得到苹果'的位置。出现在字符串中。对于string1中的每个位置,请查看string2中的所有位置。 向左和向右看,看看共性从初始位置延伸到多远。
$string1 = "peachbananaapplepear12345applegrapeapplebcdefghijk";
$string2 = "juicenaapplewatermelonpearkiwi12345applebcdefghijkberryapple";
my $SearchFor="apple";
my $SearchStrLen = length($SearchFor);
# Get locations in first string where the search term appears
my @FirstPositions = getPostions($string1);
# Get locations in second string where the search term appears
my @SecondPositions = getPostions($string2);
CheckForMaxMatch();
sub getPostions
{
my $GivenString = shift;
my @Positions;
my $j=0;
for (my $i=0; $i < length($GivenString); $i += ($SearchStrLen+1) )
{
$j = index($GivenString, $SearchFor, $i);
if ($j == -1) {
last;
}
push (@Positions, $j);
$i = $j;
}
return @Positions;
}
sub CheckForMaxMatch
{
my $MaxLeft=0;
# From the location of 'apple', look to the left and right
# to see how far the characters are same
for my $i (@FirstPositions) {
for my $j (@SecondPositions) {
my $LeftMatchPos = getMaxMatch($i, $j, -1);
my $RightMatchPos = getMaxMatch($i, $j, 1);
if ( ($RightMatchPos - $LeftMatchPos) > ($MaxRight - $MaxLeft) ) {
$MaxLeft = $LeftMatchPos;
$MaxRight = $RightMatchPos;
}
}
}
my $LongestSubString = substr($string1, $MaxLeft, $MaxRight-$MaxLeft);
print "Longest common substring is: $LongestSubString\n";
print "It begins at $MaxLeft and ends at $MaxRight in string1\n";
}
sub getMaxMatch
{
my $i= shift;
my $j= shift;
my $direction= shift;
my $k = ($direction >= 1 ? $SearchStrLen : 0);
my $FirstChar = substr($string1, $i+($k * $direction), 1);
my $SecondChar = substr($string2, $j+($k * $direction), 1);
for ( ; $FirstChar && $SecondChar; $k++ )
{
$FirstChar = substr($string1, $i+($k * $direction), 1);
$SecondChar = substr($string2, $j+($k * $direction), 1);
if ( $FirstChar ne $SecondChar ) {
$direction < 1 ? $k-- : "";
my $pos = ($k ? ($i + $k * $direction) : $i);
return $pos;
}
}
return $i;
}
答案 1 :(得分:0)
=〜运算符不会重新分配$ string2的值。试试这个:
$string2 =~ m/(.+apple.+)/;
my $match = $1;
print $match
答案 2 :(得分:0)
基于general algorithm,但不仅跟踪当前运行的长度(@l
),还跟踪是否包含关键字(@k
)。只有包含关键字的运行才会被视为运行时间最长。
use strict;
use warnings;
use feature qw( say );
sub find_substrs {
our $s; local *s = \shift;
our $key; local *key = \shift;
my @positions;
my $position = -1;
while (1) {
$position = index($s, $key, $position+1);
last if $position < 0;
push @positions, $position;
}
return @positions;
}
sub lcsubstr_which_include {
our $s1; local *s1 = \shift;
our $s2; local *s2 = \shift;
our $key; local *key = \shift;
my @key_starts1 = find_substrs($s1, $key)
or return;
my @key_starts2 = find_substrs($s2, $key)
or return;
my @is_key_start1; $is_key_start1[$_] = 1 for @key_starts1;
my @is_key_start2; $is_key_start2[$_] = 1 for @key_starts2;
my @s1 = split(//, $s1);
my @s2 = split(//, $s2);
my $length = 0;
my @rv;
my @l = ( 0 ) x ( @s1 + 1 ); # Last ele is read when $i1==0.
my @k = ( 0 ) x ( @s1 + 1 ); # Same.
for my $i2 (0..$#s2) {
for my $i1 (reverse 0..$#s1) {
if ($s1[$i1] eq $s2[$i2]) {
$l[$i1] = $l[$i1-1] + 1;
$k[$i1] = $k[$i1-1] || ( $is_key_start1[$i1] && $is_key_start2[$i2] );
if ($k[$i1]) {
if ($l[$i1] > $length) {
$length = $l[$i1];
@rv = [ $i1, $i2, $length ];
}
elsif ($l[$i1] == $length) {
push @rv, [ $i1, $i2, $length ];
}
}
} else {
$l[$i1] = 0;
$k[$i1] = 0;
}
}
}
for (@rv) {
$_->[0] -= $length;
$_->[1] -= $length;
}
return @rv;
}
{
my $s1 = "peachbananaapplepear";
my $s2 = "juicenaapplewatermelonpear";
my $key = "apple";
for (lcsubstr_which_include($s1, $s2, $key)) {
my ($s1_pos, $s2_pos, $length) = @$_;
say substr($s1, $s1_pos, $length);
}
}
O(NM)中的这个解决方案,意味着它可以很好地扩展(它的功能)。