I'm looking for a simple way to find the matching part of two strings in PHP (specifically in the context of URIs).
For example, consider the two strings:
http://2.2.2.2/~machinehost/deployment_folder/
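and
/~machinehost/deployment_folder/users/bob/settings
What I need is to strip the part that the two strings have in common from the second string, leaving:
users/bob/settings
before prepending the first string as a prefix to form an absolute URI.
Is there some simple way (in PHP) to compare two arbitrary strings for matching substrings within them?
EDIT: as pointed out, I mean the longest matching substring common to both strings.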
Answer 0 (score: 4)
This is the answer: a ready-to-use PHP function.
Answer 1 (score: 2)
Assuming your strings are $a and $b, respectively, you can use this:
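    $a = 'http://2.2.2.2/~machinehost/deployment_folder/';
    $b = '/~machinehost/deployment_folder/users/bob/settings';
    $len_a = strlen($a);
    $len_b = strlen($b);
    // $p is the number of characters dropped from the end of $b; start where the
    // remaining prefix of $b is no longer than $a.
    for ($p = max(0, $len_b - $len_a); $p < $len_b; $p++)
        // Stop at the longest prefix of $b that is also a suffix of $a.
        if (substr($a, $len_a - ($len_b - $p)) === substr($b, 0, $len_b - $p))
            break;
    $result = $a.substr($b, $len_b - $p);
    echo $result;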
This results in http://2.2.2.2/~machinehost/deployment_folder/users/bob/settings.
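The same suffix/prefix overlap idea can be wrapped in reusable functions that also return the stripped remainder the question asks for. This is only a minimal sketch: the names overlap_length and strip_overlap are hypothetical, and the comparison is byte-based, so it assumes single-byte (ASCII) URIs.

```php
<?php
// Hypothetical helper: length of the longest suffix of $base that is also a prefix of $path.
function overlap_length(string $base, string $path): int
{
    $max = min(strlen($base), strlen($path));
    for ($n = $max; $n > 0; $n--) {
        // Compare the last $n bytes of $base with the first $n bytes of $path.
        if (substr($base, -$n) === substr($path, 0, $n)) {
            return $n;
        }
    }
    return 0;
}

// Hypothetical helper: $path with the part that overlaps the end of $base removed.
function strip_overlap(string $base, string $path): string
{
    return substr($path, overlap_length($base, $path));
}

$base = 'http://2.2.2.2/~machinehost/deployment_folder/';
$path = '/~machinehost/deployment_folder/users/bob/settings';
echo strip_overlap($base, $path), "\n";          // users/bob/settings
echo $base . strip_overlap($base, $path), "\n";  // http://2.2.2.2/~machinehost/deployment_folder/users/bob/settings
```

Searching from the longest candidate down means the first hit is the largest overlap; when there is no overlap at all, the path is returned unchanged.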
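Answer 2 (score: 1)
The longest common match can also be found with a regular expression.
The function below takes two strings, builds a regular expression from one of them, and then runs it against the other.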
    /**
     * Determine the longest common match within two strings
     *
     * @param string $str1
     * @param string $str2 Two strings in any order.
     * @param boolean $case_sensitive Set to true to force
     * case sensitivity. Default: false (case insensitive).
     * @return string The longest string - first match.
     */
    function get_longest_common_subsequence( $str1, $str2, $case_sensitive = false ) {
        // First check to see if one string is the same as the other.
        if ( $str1 === $str2 ) return $str1;
        if ( ! $case_sensitive && strtolower( $str1 ) === strtolower( $str2 ) ) return $str1;
        // We'll use '#' as our regex delimiter. Any character can be used as we'll quote the string anyway,
        $delimiter = '#';
        // We'll find the shortest string and use that to check substrings and create our regex.
        $l1 = strlen( $str1 );
        $l2 = strlen( $str2 );
        $str = $l1 <= $l2 ? $str1 : $str2;
        $str2 = $l1 <= $l2 ? $str2 : $str1;
        $l = min( $l1, $l2 );
        // Next check to see if one string is a substring of the other.
        if ( $case_sensitive ) {
            if ( strpos( $str2, $str ) !== false ) {
                return $str;
            }
        }
        else {
            if ( stripos( $str2, $str ) !== false ) {
                return $str;
            }
        }
        // Regex for each character will be of the format (?:a(?=b))?
        // We also need to capture the last character, but this prevents us from matching strings with a single character. (?:.|c)?
        $reg = $delimiter;
        for ( $i = 0; $i < $l; $i++ ) {
            $a = preg_quote( $str[ $i ], $delimiter );
            $b = $i + 1 < $l ? preg_quote( $str[ $i + 1 ], $delimiter ) : false;
            $reg .= sprintf( $b !== false ? '(?:%s(?=%s))?' : '(?:.|%s)?', $a, $b );
        }
        $reg .= $delimiter;
        if ( ! $case_sensitive ) {
            $reg .= 'i';
        }
        // Resulting example regex from a string 'abbc':
        // '#(?:a(?=b))?(?:b(?=b))?(?:b(?=c))?(?:.|c)?#i';
        // Perform our regex on the remaining string
        $str = $l1 <= $l2 ? $str2 : $str1;
        if ( preg_match_all( $reg, $str, $matches ) ) {
            // $matches is an array with a single array with all the matches.
            return array_reduce( $matches[0], function( $a, $b ) {
                $al = strlen( $a );
                $bl = strlen( $b );
                // Return the longest string, as long as it's not a single character.
                return $al >= $bl || $bl <= 1 ? $a : $b;
            }, '' );
        }
        // No match - Return an empty string.
        return '';
    }
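It builds the regular expression from the shorter of the two strings, although performance will most likely be the same either way. It can mismatch strings that contain recurring substrings, and it is limited to matches of two or more characters, unless the strings are equal or one is a substring of the other. For instance: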
// Works as intended.
get_longest_common_subsequence( 'abbc', 'abc' ) === 'ab';
// Returns incorrect substring based on string length and recurring substrings.
get_longest_common_subsequence( 'abbc', 'abcdef' ) === 'abc';
// Does not return any matches, as all recurring strings are only a single character long.
get_longest_common_subsequence( 'abc', 'ace' ) === '';
// One of the strings is a substring of the other.
get_longest_common_subsequence( 'abc', 'a' ) === 'a';
Either way, it works using an alternative approach, and the regular expression could be refined to handle the other cases.
Answer 3 (score: 0)
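I'm not sure I understand your whole requirement, but the idea is:
Let A be your URL and B be "/~machinehost/deployment_folder/users/bob/settings".
I haven't tested this, but if you really need it, I can help you make this brilliant (sarcasm) solution work.
Note that you can use a regular expression like
    $pattern = '#' . preg_quote($B, '#') . '(.*)#';
    $res = array();
    preg_match_all($pattern, $A, $res);
EDIT: I think your last comment invalidates my reply. But what you want is to find substrings. So you could start with a brute-force algorithm that tries to find B[1:i] in A for i in {2, length(B)}, and then use some dynamic programming.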
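As a rough illustration of the dynamic programming route, here is a minimal sketch of the classic longest-common-substring table, keeping only two rows at a time. The function name longest_common_substring is hypothetical, the comparison is byte-based (not multibyte-aware), and when several substrings tie for the longest it returns the one that ends first in the first argument.

```php
<?php
// Hypothetical helper: longest common substring of $a and $b via dynamic programming.
// Time is O(strlen($a) * strlen($b)); memory is two rows of strlen($b) + 1 integers.
function longest_common_substring(string $a, string $b): string
{
    $lenA = strlen($a);
    $lenB = strlen($b);
    // $prev[$j] holds the length of the common run ending at the previous byte of $a and $b[$j - 1].
    $prev = array_fill(0, $lenB + 1, 0);
    $best = 0; // length of the longest run found so far
    $end  = 0; // offset in $a just past the end of that run
    for ($i = 1; $i <= $lenA; $i++) {
        $curr = array_fill(0, $lenB + 1, 0);
        for ($j = 1; $j <= $lenB; $j++) {
            if ($a[$i - 1] === $b[$j - 1]) {
                // Extend the run that ended one byte earlier in both strings.
                $curr[$j] = $prev[$j - 1] + 1;
                if ($curr[$j] > $best) {
                    $best = $curr[$j];
                    $end  = $i;
                }
            }
        }
        $prev = $curr;
    }
    return substr($a, $end - $best, $best);
}

$A = 'http://2.2.2.2/~machinehost/deployment_folder/';
$B = '/~machinehost/deployment_folder/users/bob/settings';
echo longest_common_substring($A, $B), "\n"; // /~machinehost/deployment_folder/
```

Unlike a suffix/prefix comparison, this finds the common part wherever it sits in either string, at the cost of quadratic time.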
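Answer 4 (score: 0)
Given your requirements, there doesn't seem to be any out-of-the-box code for this. So let's look for a simple approach.
For this exercise I used two methods: one to find the longest match, and another to cut off the matching part.
The FindLongestMatch() method splits the path and looks for matches in the other path piece by piece, keeping only one match, the longest (no arrays, no sorting). The RemoveLongestMatch() method takes the suffix, or "remainder", after the position where the longest match was found.
Here is the full source code: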
<?php
// Walks the segments of $absolutePath from last to first, growing a candidate match
// while it still occurs in $relativePath, and returns the longest candidate found.
function FindLongestMatch($relativePath, $absolutePath)
{
    static $_separator = '/';
    $match = '';
    $longestMatch = '';
    $splitted = array_reverse(explode($_separator, $absolutePath));
    foreach ($splitted as $value)
    {
        $matchTest = $value.$_separator.$match;
        if (IsSubstring($relativePath, $matchTest))
            $match = $matchTest;
        if (!empty($value) && IsNewMatchLonger($match, $longestMatch))
            $longestMatch = $match;
    }
    return $longestMatch;
}
//Removes from the first string the longest match.
function RemoveLongestMatch($relativePath, $absolutePath)
{
    $match = FindLongestMatch($relativePath, $absolutePath);
    $positionFound = strpos($relativePath, $match);
    $suffix = substr($relativePath, $positionFound + strlen($match));
    return $suffix;
}
function IsNewMatchLonger($match, $longestMatch)
{
    return strlen($match) > strlen($longestMatch);
}
// True only when $subString occurs in $string at a position greater than 0;
// a match at the very start of $string does not count.
function IsSubstring($string, $subString)
{
    return strpos($string, $subString) > 0;
}
Here is a representative subset of the test cases:
//TEST CASES
echo "<br>-----------------------------------------------------------";
echo "<br>".$absolutePath = 'http://2.2.2.2/~machinehost/deployment_folder/';
echo "<br>".$relativePath = '/~machinehost/deployment_folder/users/bob/settings';
echo "<br>Longest match: ".findLongestMatch($relativePath, $absolutePath);
echo "<br>Suffix: ".removeLongestMatch($relativePath, $absolutePath);
echo "<br>-----------------------------------------------------------";
echo "<br>".$absolutePath = 'http://1.1.1.1/root/~machinehost/deployment_folder/';
echo "<br>".$relativePath = '/root/~machinehost/deployment_folder/users/bob/settings';
echo "<br>Longest match: ".findLongestMatch($relativePath, $absolutePath);
echo "<br>Suffix: ".removeLongestMatch($relativePath, $absolutePath);
echo "<br>-----------------------------------------------------------";
echo "<br>".$absolutePath = 'http://2.2.2.2/~machinehost/deployment_folder/users/';
echo "<br>".$relativePath = '/~machinehost/deployment_folder/users/bob/settings';
echo "<br>Longest match: ".findLongestMatch($relativePath, $absolutePath);
echo "<br>Suffix: ".removeLongestMatch($relativePath, $absolutePath);
echo "<br>-----------------------------------------------------------";
echo "<br>".$absolutePath = 'http://3.3.3.3/~machinehost/~machinehost/subDirectory/deployment_folder/';
echo "<br>".$relativePath = '/~machinehost/subDirectory/deployment_folderX/users/bob/settings';
echo "<br>Longest match: ".findLongestMatch($relativePath, $absolutePath);
echo "<br>Suffix: ".removeLongestMatch($relativePath, $absolutePath);
Running the test cases above produces the following output:
http://2.2.2.2/~machinehost/deployment_folder/
/~machinehost/deployment_folder/users/bob/settings
Longest match: ~machinehost/deployment_folder/
Suffix: users/bob/settings
http://1.1.1.1/root/~machinehost/deployment_folder/
/root/~machinehost/deployment_folder/users/bob/settings
Longest match: root/~machinehost/deployment_folder/
Suffix: users/bob/settings
http://2.2.2.2/~machinehost/deployment_folder/users/
/~machinehost/deployment_folder/users/bob/settings
Longest match: ~machinehost/deployment_folder/users/
Suffix: bob/settings
http://3.3.3.3/~machinehost/~machinehost/subDirectory/deployment_folder/
/~machinehost/subDirectory/deployment_folderX/users/bob/settings
Longest match: ~machinehost/subDirectory/
Suffix: deployment_folderX/users/bob/settings
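Maybe you can make sense of this code and turn it into something useful for your current project. Let me know whether it works for you too. By the way, Mr. oreX's answer looks good as well.
Answer 5 (score: -1)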