How do I remove duplicate words from an array?

Asked: 2014-01-08 13:34:16

Tags: php arrays unique

I have a database query that pulls out strings of text:

$descriptionsQuery = mysql_query("select prob_text from opencall where logdatex between $OneHourAgo and $TimeNow ORDER by callref DESC") or die(mysql_error());
$descriptions = array();

while ($row = mysql_fetch_assoc($descriptionsQuery)){
$descriptions[] = $row['prob_text'];
}
//put all the strings together with a space between them
$glue = implode (" ",$descriptions);

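What I would like help with is this: before the $descriptions array is "glued" into one long string, I want any duplicate words within each description removed. Once the descriptions are glued together, I rely on words being repeated across the original descriptions. It is a little hard to explain, so here is an example of what I mean. Two users enter some text:

User 1: "I have an issue with Leeds server. I am in Leeds"
User 2: "Margaret in Leeds has a problem, please call margaret"

From this, I would like User 1 to contribute only one "Leeds" to the final glued string and User 2 only one "Margaret". Both users mention "Leeds", though, so I want it to appear in the glued string twice, once for each user. Is this possible? Any help is appreciated.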

4 Answers:

Answer 0 (score: 5)

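You can do this with $newarray = array_unique($oldarray).

First explode each row to get an array of words. Use array_unique() to remove the duplicates. Then implode each row again, and finally implode all the rows.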

$descriptionsQuery = mysql_query("select prob_text from opencall where logdatex between $OneHourAgo and $TimeNow ORDER by callref DESC") or die(mysql_error());
$descriptions = array();

while ($row = mysql_fetch_assoc($descriptionsQuery)){
  $tmp = explode(' ', $row['prob_text']);
  $tmp = array_unique($tmp);
  // or case insensitive
  // $tmp = array_intersect_key($tmp, array_unique(array_map('strtolower', $tmp)));
  $descriptions[] = implode(' ', $tmp);
}
//put all the strings together with a space between them
$glue = implode (" ",$descriptions);

http://de3.php.net/function.array-unique

If you want to remove duplicates case-insensitively, you have to change the second line inside the while loop. I found the hint here: Best solution to remove duplicate values from case-insensitive array
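
The per-row approach above can be sketched as a standalone script. This is only a minimal illustration: dedupeWords() is a hypothetical helper name (not part of the original answer), the array literal stands in for the prob_text rows from the query, and splitting with preg_split() on whitespace is an assumption made here in place of explode(' ', ...).

```php
<?php
// Hypothetical helper (not from the original answer): removes repeated
// words within one description, comparing case-insensitively and keeping
// the first spelling that was seen.
function dedupeWords(string $text): string
{
    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // array_unique() keeps the key of the first occurrence, so intersecting
    // on keys gives back the original spelling of each surviving word.
    $unique = array_intersect_key(
        $words,
        array_unique(array_map('strtolower', $words))
    );

    return implode(' ', $unique);
}

// Sample data standing in for the prob_text rows (assumption).
$descriptions = [
    'I have an issue with Leeds server. I am in Leeds',
    'Margaret in Leeds has a problem, please call margaret',
];

// De-duplicate each description on its own, then glue them together.
$glue = implode(' ', array_map('dedupeWords', $descriptions));
echo $glue, PHP_EOL;
```

With this sample data each description keeps a single "Leeds", so the glued string contains "Leeds" twice, once per user. Punctuation stays attached to words here ("server." and "problem,"), so "Leeds." and "Leeds" would be treated as different words.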

Answer 1 (score: 1)

It would be best to do this in the query.

You can do something like:
SELECT DISTINCT prob_text FROM opencall WHERE logdatex BETWEEN $OneHourAgo AND $TimeNow ORDER BY callref DESC

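This returns each distinct prob_text value only once, so you will not select duplicate rows. Note that DISTINCT compares whole column values, so repeated words inside a single description are not affected.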

http://dev.mysql.com/doc/refman/5.0/en/distinct-optimization.html

Answer 2 (score: 0)

Use array_unique, or use DISTINCT in the query.

$descriptionsQuery = mysql_query("select prob_text from opencall where logdatex between $OneHourAgo and $TimeNow ORDER by callref DESC") or die(mysql_error());
$descriptions = array();

while ($row = mysql_fetch_assoc($descriptionsQuery)){
$descriptions[] = $row['prob_text'];
}

//remove duplicates:
$descriptions = array_unique($descriptions);

//put all the strings together with a space between them
$glue = implode (" ",$descriptions);
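
As a small check of what this does, the sketch below runs array_unique() on made-up sample rows (the array literal and the variable names $uniqueRows and $glueRows are assumptions for illustration). It suggests that array_unique() on $descriptions compares whole strings, so only identical descriptions are collapsed and repeated words inside one description remain.

```php
<?php
// Made-up sample rows; the second and third are identical strings.
$descriptions = [
    'I am in Leeds Leeds',
    'printer broken',
    'printer broken',
];

// array_unique() compares whole array elements, not the words inside them.
$uniqueRows = array_unique($descriptions);

$glueRows = implode(' ', $uniqueRows);
echo $glueRows, PHP_EOL; // I am in Leeds Leeds printer broken
```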

Answer 3 (score: 0)

This seems like a good time to use array_walk with anonymous functions. This will filter out all duplicate words within a single message, ignoring case:

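// $chat is the db result array
foreach ($chat as &$msg) {
    $final = [];
    // array_walk() takes its array by reference, so store the words in a variable first
    $words = str_word_count($msg, 1);
    array_walk($words, function ($word) use (&$final) {
        if (!in_array(strtolower($word), array_map('strtolower', $final))) {
            $final[] = $word;
        }
    });
    $msg = implode(' ', $final);
}
unset($msg); // break the reference left over from the foreach
$filtered = implode(' ', $chat);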

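Note the use of str_word_count() instead of explode(). I have not tested this in a production environment, but it strips basic punctuation (except ' and -), which might be useful when trying to create a tag cloud.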
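
To illustrate the difference between the two splitting functions, here is a minimal sketch using User 2's text from the question. The variable names $byExplode and $byWordCount are chosen here for illustration only.

```php
<?php
$msg = 'Margaret in Leeds has a problem, please call margaret';

// explode() splits on single spaces only, so punctuation stays attached.
$byExplode = explode(' ', $msg);

// str_word_count() with format 1 returns an array of the words it finds;
// by default a word consists of letters and may contain ' and -.
$byWordCount = str_word_count($msg, 1);

echo $byExplode[5], PHP_EOL;   // problem,
echo $byWordCount[5], PHP_EOL; // problem
```

Because the comma is dropped, "problem," and "problem" would count as the same word when de-duplicating with str_word_count().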