I have an array of strings that contains both single words and sentences.
For example:
array("dog","cat","the dog is running","some other text","some","text")
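I want to remove duplicate words and keep only the unique ones. I also want to remove those words from inside the sentences.
The result should look like this: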
array("dog","cat","the is running","other","some","text")
I tried the array_unique function, but it did not help.
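A quick check, as a minimal sketch, of why array_unique alone has no effect on this input: it compares whole strings, and all six entries are distinct as whole strings, so nothing is removed.

```php
<?php
$arr = array("dog", "cat", "the dog is running", "some other text", "some", "text");

// array_unique compares complete elements, not the words inside them,
// so "dog" and "the dog is running" are simply two different strings.
var_dump(array_unique($arr) === $arr); // bool(true): the array is unchanged
```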
Answer 0: (score: 0)
You can loop over the array with explode and array_push, then apply array_unique to the collected words:
$res = [];
foreach ($arr as $e) {
    // split each entry into words and append all of them to $res
    array_push($res, ...explode(" ", $e));
}
print_r(array_unique($res));
References: array_push, explode, array_unique
Live example: 3v4l
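To make the effect of the loop above concrete, here is a minimal sketch that wraps it in a function and runs it on the sample input. The name uniqueWords is hypothetical, and the array_values call is an addition to reindex the result. Note that this variant flattens the sentences into individual words rather than keeping them.

```php
<?php
// Hypothetical helper name; wraps the explode/array_push loop shown above.
function uniqueWords(array $arr): array
{
    $res = [];
    foreach ($arr as $e) {
        // split each entry on single spaces and append every word
        array_push($res, ...explode(" ", $e));
    }
    // array_unique keeps the original keys, so reindex with array_values
    return array_values(array_unique($res));
}

print_r(uniqueWords(["dog", "cat", "the dog is running", "some other text", "some", "text"]));
// words in order: dog, cat, the, is, running, some, other, text
```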
If you want to keep the sentences, use:
$arr = array("dog","cat","the dog is running","some other text","some","text");
// sort by word count so that single words are processed before sentences
usort($arr, function ($a, $b) { return count(explode(" ", $a)) - count(explode(" ", $b)); });
$words = []; // replacement map for strtr: seen word => empty string
$res = [];
foreach ($arr as $e) {
    // remove the words seen so far, then collapse the spaces they leave behind
    $res[] = trim(preg_replace('/\s+/', ' ', strtr($e, $words)));
    foreach (explode(" ", $e) as $w)
        $words[$w] = ''; // register every word of this entry as seen
}
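One caveat with strtr in the snippet above: it replaces substrings, not whole words, so a seen word is also cut out of longer words that contain it. The sketch below contrasts that with a whole-word filter; removeSeenWords is a hypothetical helper, not part of the answer.

```php
<?php
// Hypothetical helper: drop only whole words that appear as keys of $seen.
function removeSeenWords(string $sentence, array $seen): string
{
    $kept = [];
    foreach (explode(" ", $sentence) as $w) {
        // skip empty pieces (from repeated spaces) and words already seen
        if ($w !== "" && !isset($seen[$w])) {
            $kept[] = $w;
        }
    }
    return implode(" ", $kept);
}

$seen = ["dog" => ""];
// substring replacement: "dog" is also removed from inside "dogs"
echo strtr("the dogs chase a dog", $seen), "\n";
// whole-word comparison: only the standalone "dog" is dropped
echo removeSeenWords("the dogs chase a dog", $seen), "\n";
```

Whether the substring behaviour matters depends on the data; for the sample array in the question both approaches remove the same words.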
Answer 1: (score: 0)
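This solution is not pretty, but it should get the job done and handle a few edge cases. I am assuming that the words in a sentence string are separated by no more than one space and that you want to preserve the original order.
The approach is to traverse the array twice: once to filter out duplicate single words, and then again to filter out repeated words inside the sentences. This guarantees that single words take priority. Finally, ksort the array (this is the ugly part from a time-complexity point of view: up to that point everything is O(max_len_sentence * n)).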
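$arr = ["dog","cat","the dog is running","some other text","some","text"];
$seen = [];   // word => number of times it has been encountered
$result = [];
// Pass 1: keep each single-word entry the first time it appears
foreach ($arr as $i => $e) {
    if (preg_match("/^\w+$/", $e) &&
        !array_key_exists($e, $seen)) {
        $result[$i] = $e;
        $seen[$e] = 1;
    }
}
// Pass 2: in sentences, keep only the words that have not been seen yet
foreach ($arr as $i => $e) {
    $words = explode(" ", $e);
    if (count($words) > 1) {
        $filtered = [];
        foreach ($words as $word) {
            if (!array_key_exists($word, $seen)) {
                $seen[$word] = 0;
            }
            if (++$seen[$word] < 2) {
                $filtered[] = $word;
            }
        }
        if ($filtered) {
            // separator goes first; the reversed argument order is rejected as of PHP 8.0
            $result[$i] = implode(" ", $filtered);
        }
    }
}
ksort($result); // restore the original order of the entries
$result = array_values($result);
print_r($result);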
Array
(
    [0] => dog
    [1] => cat
    [2] => the is running
    [3] => other
    [4] => some
    [5] => text
)
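The answer above assumes at most one space between words. As a minimal sketch under the same two-pass idea, the variant below relaxes that assumption by splitting on any run of whitespace with preg_split. The names splitWords and dedupeWords are hypothetical, and a single-word entry is detected here by word count rather than by the /^\w+$/ check used above.

```php
<?php
// Hypothetical helper: split on any run of whitespace instead of one space.
function splitWords(string $s): array
{
    // PREG_SPLIT_NO_EMPTY drops empty pieces from leading/trailing whitespace
    return preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: two-pass de-duplication that keeps the entry order.
function dedupeWords(array $arr): array
{
    $seen = [];
    $result = [];
    // Pass 1: single-word entries claim their word first
    foreach ($arr as $i => $e) {
        $words = splitWords($e);
        if (count($words) === 1 && !isset($seen[$words[0]])) {
            $result[$i] = $words[0];
            $seen[$words[0]] = true;
        }
    }
    // Pass 2: sentences keep only the words not seen so far
    foreach ($arr as $i => $e) {
        $words = splitWords($e);
        if (count($words) > 1) {
            $kept = [];
            foreach ($words as $w) {
                if (!isset($seen[$w])) {
                    $seen[$w] = true;
                    $kept[] = $w;
                }
            }
            if ($kept) {
                $result[$i] = implode(" ", $kept);
            }
        }
    }
    ksort($result);               // back to the original entry order
    return array_values($result); // reindex from 0
}

// note the double space in the third entry
print_r(dedupeWords(["dog", "cat", "the  dog is running", "some other text", "some", "text"]));
```

On this input the function returns the same six entries as the output shown above, with the doubled space collapsed.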