I have a list of many strings that resemble one another, for example:
$str = array('monkey eat a banana',
'dog eat a banana',
'cat devour an apple',
'cat dine a coco'); //etc
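I want to extract from this array the X strings that differ most from one another. For example, if I want 3, they would be 'monkey eat a banana', 'cat dine a coco' and 'cat devour an apple'.
How can I do this? I found the similar_text() function and I think I can use it, but how do I use it to extract them for any value of X?
Thanks for your advice.
PS: I'm using this for SEO; the goal is to avoid duplicate content as far as possible.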
Answer 0: (score: 1)
I tested with the sample code below. The conclusion: pick the strings with the lowest percentage from similar_text(); those are the most different ones.
$str = array('monkey eat a banana',
'dog eat a banana',
'cat devour an apple',
'cat dine a coco');
$len = count($str);
echo '<table width="100%">';
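for ($i = 0; $i < $len; $i++) {
    for ($j = 0; $j < $len; $j++) {
        // Skip comparing a string with itself (always 100% similar).
        if ($i == $j) continue;
        // similar_text() returns the number of matching characters and
        // writes the similarity percentage into $percent.
        $num = similar_text($str[$i], $str[$j], $percent);
        echo '<tr><td>' . $str[$i] . '</td><td>' . $str[$j] . '</td><td>' . strlen($str[$i]) . '</td><td>' . strlen($str[$j]) . '</td><td>' . $num . '</td><td>' . number_format($percent, 0) . '</td></tr>';
    }
}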
echo '</table>';
The result is as follows (rows comparing a string with itself are included for reference):
string 1             string 2             len 1  len 2  matching chars  percentage
monkey eat a banana  monkey eat a banana  19     19     19              100
monkey eat a banana  dog eat a banana     19     16     14              80
monkey eat a banana  cat devour an apple  19     19     7               37
monkey eat a banana  cat dine a coco      19     15     5               29
dog eat a banana     monkey eat a banana  16     19     14              80
dog eat a banana     dog eat a banana     16     16     16              100
dog eat a banana     cat devour an apple  16     19     7               40
dog eat a banana     cat dine a coco      16     15     5               32
cat devour an apple  monkey eat a banana  19     19     7               37
cat devour an apple  dog eat a banana     19     16     7               40
cat devour an apple  cat devour an apple  19     19     19              100
cat devour an apple  cat dine a coco      19     15     9               53
cat dine a coco      monkey eat a banana  15     19     5               29
cat dine a coco      dog eat a banana     15     16     5               32
cat dine a coco      cat devour an apple  15     19     9               53
cat dine a coco      cat dine a coco      15     15     15              100
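To turn "pick the lowest percentages" into something that works for any X, one option is a greedy selection. This is only a sketch, not the answerer's code: the function name pick_most_different() is made up for illustration, and the greedy strategy (seed with the least similar pair, then keep adding the string whose highest similarity to the already chosen ones is lowest) is an assumption about what "most different" should mean. It is a heuristic and is not guaranteed to find the best possible subset.

```php
<?php
// Hypothetical helper (not from the original answer): greedily pick up to $x
// strings that are as dissimilar as possible according to similar_text().
function pick_most_different(array $strings, int $x): array
{
    $strings = array_values($strings);
    $n = count($strings);
    if ($x <= 0 || $n === 0) {
        return [];
    }
    $x = min($x, $n);

    // Pairwise similarity percentages (0..100), computed once per pair.
    $sim = [];
    for ($i = 0; $i < $n; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            similar_text($strings[$i], $strings[$j], $percent);
            $sim[$i][$j] = $percent;
            $sim[$j][$i] = $percent;
        }
    }

    // Seed: the least similar pair (or just the first string when $x is 1).
    $chosen = [0];
    if ($x >= 2) {
        $lowest = null;
        for ($i = 0; $i < $n; $i++) {
            for ($j = $i + 1; $j < $n; $j++) {
                if ($lowest === null || $sim[$i][$j] < $lowest) {
                    $lowest = $sim[$i][$j];
                    $chosen = [$i, $j];
                }
            }
        }
    }

    // Grow: add the candidate whose worst-case (highest) similarity
    // to the strings chosen so far is the lowest.
    while (count($chosen) < $x) {
        $bestIndex = null;
        $bestScore = null;
        for ($c = 0; $c < $n; $c++) {
            if (in_array($c, $chosen, true)) {
                continue;
            }
            $worst = 0.0;
            foreach ($chosen as $k) {
                $worst = max($worst, $sim[$c][$k]);
            }
            if ($bestScore === null || $worst < $bestScore) {
                $bestScore = $worst;
                $bestIndex = $c;
            }
        }
        $chosen[] = $bestIndex;
    }

    $result = [];
    foreach ($chosen as $index) {
        $result[] = $strings[$index];
    }
    return $result;
}
```

With the four example strings and X = 3, this should pick 'monkey eat a banana', 'cat dine a coco' and 'cat devour an apple', in that order, which matches the percentages in the table. Note that it compares every pair, so the cost grows quadratically with the number of strings.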
Answer 1: (score: 1)
Hope this helps:
$str = array(
'cat devour an apple',
'dog eat a banana',
'monkey eat a banana',
'cat dine a coco',
); //etc
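// Sum each string's similarity to every other string.
$overall_scores = [];
foreach ($str as $i => $s) {
    $overall_scores[$i] = 0;
    foreach ($str as $j => $d) {
        if ($i != $j) {
            // similar_text() returns the number of matching characters.
            $overall_scores[$i] += similar_text($s, $d);
        }
    }
}
// Sort ascending by score, keeping keys: the least similar strings come first.
asort($overall_scores);
$x = 3; // number of strings to extract
$results_index = array_slice(array_keys($overall_scores), 0, $x);
$result_str = [];
foreach ($results_index as $index) {
    $result_str[] = $str[$index];
}
var_dump($result_str);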
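One thing to watch with the code above: it adds up raw matching-character counts, which are not normalized for string length, so two strings can tie even when one is proportionally closer to the rest. A possible variant, sketched here as an assumption and not as part of the answer, averages the percentage that similar_text() reports through its third argument. The function name least_similar_on_average() is hypothetical.

```php
<?php
// Hypothetical variant of the approach above: rank strings by their average
// similarity percentage to all other strings, lowest first.
function least_similar_on_average(array $strings, int $x): array
{
    $strings = array_values($strings);
    $n = count($strings);
    $scores = [];
    foreach ($strings as $i => $s) {
        $total = 0.0;
        foreach ($strings as $j => $d) {
            if ($i !== $j) {
                // The third argument receives the similarity as a percentage.
                similar_text($s, $d, $percent);
                $total += $percent;
            }
        }
        $scores[$i] = $n > 1 ? $total / ($n - 1) : 0.0;
    }

    // Ascending sort that preserves the original indexes as keys.
    asort($scores);

    $result = [];
    foreach (array_slice(array_keys($scores), 0, max(0, $x)) as $index) {
        $result[] = $strings[$index];
    }
    return $result;
}
```

For the four example strings and X = 3, this should return 'cat dine a coco', 'cat devour an apple' and 'monkey eat a banana'. Like the original, it scores each string against the whole list, not against the strings already selected, so it can still return two strings that are close to each other.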