Matching/comparing text strings in PHP
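Hi everyone, I want to compare some strings, basically to find out whether a product is already present in a product feed. Because the sources differ, an exact (identical) match is not guaranteed. Product names sometimes have more or fewer characters (iPad white vs. iPad Apple white), so I would like to do an approximate match, perhaps something like the fuzzy search (~) in Lucene.
So far I know of and have used preg_match and levenshtein. Can you recommend any other methods for similarity matching of strings in PHP?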
Answer 0: (score: 2)
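You asked whether anyone had an idea you could use. This is an example from the PHP website, but I think it can help you.
(I have modified the code to fit the kind of product data on your site):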
<?php
$productString = 'Apple white IPOD';

// array of products to check against
$products = array('zen', 'dell laptop', 'apple laptop', 'apple black ipod',
                  'apple mini', 'Random product');

// no closest product or shortest distance found yet
$closest = null;
$shortest = -1;

// loop through products to find the closest product
foreach ($products as $product) {
    // calculate the distance between the input string and the current
    // product; both are lowercased so that case differences do not count
    $lev = levenshtein(strtolower($productString), strtolower($product));

    // check for an exact match
    if ($lev == 0) {
        // closest product is this one (exact match)
        $closest = $product;
        $shortest = 0;

        // break out of the loop; we've found an exact match
        break;
    }

    // if this distance is no greater than the shortest distance found
    // so far, OR if no shortest distance has been found yet
    if ($lev <= $shortest || $shortest < 0) {
        // set the closest match and the shortest distance
        $closest = $product;
        $shortest = $lev;
    }
}

echo "Search product: $productString\n";
if ($shortest == 0) {
    echo "Exact match found: $closest\n";
} else {
    echo "Did you mean: $closest?\n";
}
?>
The code above searches the list of products (an array) and finds the closest match. If an exact match is found, that match is used.
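One possible refinement, sketched here under a few assumptions: a closest-match loop always returns some product, even when nothing in the feed is really similar. The hypothetical helper `closestProduct()` below lowercases both strings, divides the Levenshtein distance by the length of the longer string, and returns `null` when the best ratio exceeds a cutoff. The function name, the `$maxRatio` parameter, and the 0.4 default are illustrative choices, not values from the question or the PHP manual, and would need tuning against real feed data. Note that `levenshtein()` works on bytes, so multibyte product names may need extra handling.

```php
<?php
// Hypothetical helper (not part of PHP): returns the product whose name is
// closest to $needle, or null when even the best candidate is too different.
// $maxRatio is an assumed cutoff: edit distance / length of the longer string.
function closestProduct(string $needle, array $products, float $maxRatio = 0.4): ?string
{
    $needleNorm = strtolower(trim($needle));
    $best = null;
    $bestRatio = INF;

    foreach ($products as $product) {
        $candidate = strtolower(trim($product));

        // normalize the distance so long and short names are comparable;
        // the trailing 1 avoids division by zero for two empty strings
        $longest = max(strlen($needleNorm), strlen($candidate), 1);
        $ratio = levenshtein($needleNorm, $candidate) / $longest;

        // keep the first candidate with the strictly smallest ratio
        if ($ratio < $bestRatio) {
            $bestRatio = $ratio;
            $best = $product;
        }
    }

    return $bestRatio <= $maxRatio ? $best : null;
}

$products = array('zen', 'dell laptop', 'apple laptop', 'apple black ipod',
                  'apple mini', 'Random product');

var_dump(closestProduct('Apple white IPOD', $products)); // string(16) "apple black ipod"
var_dump(closestProduct('Garden hose', $products));      // NULL
```

With the sample list, 'Apple white IPOD' is 5 edits away from 'apple black ipod' over 16 characters (ratio about 0.31), so it is accepted, while 'Garden hose' has no candidate under the cutoff and is reported as not found.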