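Remove duplicate words from an array containing sentences in PHP

Time: 2019-04-10 19:30:23

Tags: php arrays unique

I have an array of strings that contains both single words and sentences.

For example: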

array("dog","cat","the dog is running","some other text","some","text")

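I want to remove duplicate words so that every word appears only once. I also want to remove those words from inside the sentences.

The result should look like this: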

array("dog","cat","the is running","other","some","text")

I tried the array_unique function, but it did not work.
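For context, a minimal sketch of why array_unique on its own has no effect here: it compares whole array elements, so "dog" and "the dog is running" count as different values and nothing is removed. The variable name $unique is only illustrative.

```php
<?php
$arr = array("dog", "cat", "the dog is running", "some other text", "some", "text");

// array_unique compares entire elements as strings. No two elements are
// identical here, so the array comes back unchanged.
$unique = array_unique($arr);

var_dump($unique === $arr); // bool(true)
```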

2 answers:

Answer 0: (score: 0)

You can use array_unique together with a loop, explode, and array_push:

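$arr = array("dog","cat","the dog is running","some other text","some","text");

$res = [];
foreach ($arr as $e) {
    // split each entry into words and append all of them to $res
    array_push($res, ...explode(" ", $e));
}
print_r(array_unique($res)); // array_unique keeps the original keys, so gaps remain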

References: array_push, explode, array_unique

Live example: 3v4l

If you want to keep the sentences, use:

$arr = array("dog","cat","the dog is running","some other text","some","text");

// sort first to get the shortest sentence first
usort($arr, function ($a, $b) { return count(explode(" ", $a)) - count(explode(" ", $b)); });

$words = []; // words seen so far, each mapped to '' for use with strtr
$res = [];
foreach ($arr as $e) {
    // blank out already-seen words, then collapse the spaces they leave behind
    $res[] = trim(preg_replace('/\s+/', ' ', strtr($e, $words)));
    foreach (explode(" ", $e) as $w) {
        $words[$w] = ''; // add all new words to the swapping array with an empty string as value
    }
}
print_r($res);
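One caveat with strtr in the snippet above: it replaces substrings, so a seen word such as "is" would also be cut out of "this". Below is a rough sketch of a whole-word alternative, assuming words are separated by whitespace. The helper name removeSeenWords is hypothetical and not part of the answer.

```php
<?php
// Hypothetical helper: drops every word of $text that appears as a key in $seen.
function removeSeenWords(string $text, array $seen): string
{
    // split on runs of whitespace and discard empty pieces
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $kept = array_filter($words, function ($w) use ($seen) {
        return !isset($seen[$w]);
    });
    return implode(' ', $kept);
}

$seen = ['is' => '', 'dog' => ''];

echo strtr('this dog is running', $seen), "\n";           // substring match also hits "this"
echo removeSeenWords('this dog is running', $seen), "\n"; // this running
```

Comparing whole words costs a split and a filter per entry, but it avoids damaging longer words that merely contain a seen word.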

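Answer 1: (score: 0)

This solution is not pretty, but it should get the job done and covers a few of the edge cases at hand. I assume that the words in a sentence string are separated by no more than one space, and that you want to preserve the original order.

The approach is to iterate over the array twice: the first pass filters out duplicate single words, and the second pass filters out duplicate words inside sentences. This guarantees that single words take priority. Finally, ksort the array, because the two passes insert results out of their original order (this is the ugly part from a time-complexity standpoint: up to that point everything is O(max_len_sentence * n), and the sort adds O(n log n)).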

$arr = ["dog","cat","the dog is running","some other text","some","text"];
$seen = [];
$result = [];

foreach ($arr as $i => $e) {
    if (preg_match("/^\w+$/", $e) && 
        !array_key_exists($e, $seen)) {
        $result[$i] = $e;
        $seen[$e] = 1;
    }
}

foreach ($arr as $i => $e) {
    $words = explode(" ", $e);

    if (count($words) > 1) {
        $filtered = [];

        foreach ($words as $word) {
            if (!array_key_exists($word, $seen)) {
                $seen[$word] = 0;
            }

            if (++$seen[$word] < 2) {
                $filtered[]= $word;
            }
        } 

        if ($filtered) {
            $result[$i] = implode(" ", $filtered); // separator first; the reversed order was removed in PHP 8
        }
    }
}

ksort($result);
$result = array_values($result);
print_r($result);

Output:

Array
(
    [0] => dog
    [1] => cat
    [2] => the is running
    [3] => other
    [4] => some
    [5] => text
)
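The single-space assumption stated in this answer could be relaxed by splitting on runs of whitespace instead of a single space. This is only a sketch of that idea, not part of the original answer; the sample sentence and variable names are made up for illustration.

```php
<?php
$sentence = "the  dog   is running";

// explode on a single space yields empty strings between consecutive spaces
$naive = explode(" ", $sentence);

// preg_split with PREG_SPLIT_NO_EMPTY treats any run of whitespace as one separator
$words = preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY);

print_r($words); // the, dog, is, running
```

Swapping this call in for explode(" ", $e) in the second loop would keep empty strings out of $seen and $filtered.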