Exclude duplicate words from an array, then randomise and limit the array

Asked: 2013-04-18 10:05:46

Tags: php arrays

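I am creating a tag cloud from article subject titles. I take each title, split it into words and put them in an array, checking that each word has strlen > 3 and is not in my array of excluded words. This works well.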

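What I am struggling with is:

  • How to randomise the order and limit the output to 20
  • How to exclude duplicates, where a duplicate means a word repeated within the same catid.

For example, the word dog below is repeated 5 times, but across 3 different catids. So I want to output the word dog 3 times, once for each distinct catid.

Array: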

'subject' => 'dog is running', 'id' => '1', 'catid' => '19'

'subject' => 'dog is walking', 'id' => '2', 'catid' => '18'

'subject' => 'dog is sitting', 'id' => '3', 'catid' => '18'

'subject' => 'dog is eating', 'id' => '4', 'catid' => '19'

'subject' => 'dog is barking', 'id' => '5', 'catid' => '20'

Here is my code:

$excluded_word_array = array('a','blah','bleh');

// prepare the tag cloud array for display
$terms = array(); // create empty array

$query = mysql_query("SELECT * FROM hesk_kb_articles WHERE type = '0'");
while($row = mysql_fetch_array($query)){

        $subject = $row['subject'];
        $id = $row['id'];
        $catid = $row['catid'];
        $words = explode(" ", $subject);
        foreach ($words as $val){
                if (strlen($val) > 3) {
                        $stripped_val = strtolower(ereg_replace("[^A-Za-z]", "", $val));
                        if (!in_array($stripped_val, $excluded_word_array)) {
                        shuffle($stripped_val);
                        $terms[] = array('subject' => $stripped_val, 'id' => $id, 'catid' => $catid);
                        }
                }
        }
}

sort($terms);
?>

2 Answers:

Answer 0 (score: 1)

You can use GROUP BY:

$query = mysql_query("SELECT * FROM hesk_kb_articles WHERE type = '0' GROUP BY subject, catid");

Also, the mysql_* functions are deprecated as of PHP 5.5.0 and will be removed in the future. The MySQLi or PDO_MySQL extension should be used instead.
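As a rough illustration of the PDO route, here is a minimal sketch. The helper name fetchArticles() is hypothetical, and because the post gives no connection details the demo runs against an in-memory SQLite database (an assumption made only to keep it runnable); with MySQL only the DSN and credentials passed to new PDO() would differ.

```php
<?php
/**
 * Fetch subject, id and catid for all articles of the given type.
 * fetchArticles() is a hypothetical helper, not part of the original post.
 */
function fetchArticles(PDO $pdo, string $type): array
{
    // A prepared statement replaces the interpolated mysql_query() string.
    $stmt = $pdo->prepare(
        'SELECT subject, id, catid FROM hesk_kb_articles WHERE type = :type'
    );
    $stmt->execute([':type' => $type]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Demo setup (assumption): in-memory SQLite standing in for the real MySQL table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec(
    'CREATE TABLE hesk_kb_articles (id INTEGER PRIMARY KEY, subject TEXT, catid TEXT, type TEXT)'
);
$insert = $pdo->prepare(
    'INSERT INTO hesk_kb_articles (subject, catid, type) VALUES (?, ?, ?)'
);
$insert->execute(['dog is running', '19', '0']);
$insert->execute(['dog is walking', '18', '0']);
$insert->execute(['not an article', '18', '1']);

$rows = fetchArticles($pdo, '0');
foreach ($rows as $row) {
    echo $row['catid'], ': ', $row['subject'], PHP_EOL;
}
```

The while loop over mysql_fetch_array() then becomes a plain foreach over $rows.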

Update 1:

Maybe this can help you:

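$excluded_word_array = array('a','blah','bleh');
$terms = array(); // words collected per catid

$query = mysql_query("SELECT * FROM hesk_kb_articles WHERE type = '0'");
while($row = mysql_fetch_array($query)){

    $subject = $row['subject'];
    $id = $row['id'];
    $catid = $row['catid'];
    $words = explode(" ", $subject);
    foreach ($words as $val){
        if (strlen($val) > 3) {
            // preg_replace needs pattern delimiters; without the slashes the
            // brackets themselves would be treated as the delimiters
            $stripped_val = strtolower(preg_replace("/[^A-Za-z]/", "", $val));
            if (!in_array($stripped_val, $excluded_word_array)) {
                // group the words by catid so duplicates can be removed per category
                $terms[$catid][] = $stripped_val;
            }
        }
    }
}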

$items = array();
foreach ($terms as $term) {
    $term = array_unique($term);
    $items = array_merge($items, $term);
}

$items will contain all the words you want.
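To check the grouping idea without a database, here is a small sketch that runs the same steps over the five sample rows from the question. One deliberate departure, labelled as an assumption: the minimum length is set to 3 so that "dog" survives, whereas the strlen > 3 check above would drop it.

```php
<?php
// Sample rows copied from the question; no database needed.
$rows = array(
    array('subject' => 'dog is running', 'id' => '1', 'catid' => '19'),
    array('subject' => 'dog is walking', 'id' => '2', 'catid' => '18'),
    array('subject' => 'dog is sitting', 'id' => '3', 'catid' => '18'),
    array('subject' => 'dog is eating',  'id' => '4', 'catid' => '19'),
    array('subject' => 'dog is barking', 'id' => '5', 'catid' => '20'),
);

// Assumption for this demo only: keep words of 3 or more letters.
$min_length = 3;
$excluded_word_array = array('a', 'blah', 'bleh');

// Collect the cleaned words per catid.
$terms = array();
foreach ($rows as $row) {
    foreach (explode(' ', $row['subject']) as $val) {
        $word = strtolower(preg_replace('/[^A-Za-z]/', '', $val));
        if (strlen($word) >= $min_length && !in_array($word, $excluded_word_array)) {
            $terms[$row['catid']][] = $word;
        }
    }
}

// Remove duplicates inside each catid, then flatten into one list.
$items = array();
foreach ($terms as $words) {
    $items = array_merge($items, array_unique($words));
}

// Shows how often each word ended up in the flattened list.
print_r(array_count_values($items));
```

With this data "dog" is counted 3 times, once per catid, and every other word once.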

Update 2:

If you want the catid along with the words, change the last foreach loop:

$i = 0;
$items = array();
foreach ($terms as $term_key => $term_value) {
    $term_value = array_unique($term_value);
    $items[$i]['catid'] = $term_key;
    $items[$i]['words'] = implode(',', $term_value);
    $i++;
}

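Now $items will contain each catid together with its words separated by commas.

Update 3:

If you want each catid and word as a separate entry, you can do this: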

$i = 0;
$items = array();
foreach ($terms as $term_key => $term_value) {
    $term_value = array_unique($term_value);
    foreach ($term_value as $term) {
         $items[$i]['catid'] = $term_key;
         $items[$i]['words'] = $term;
         $i++;
    }
}
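The remaining part of the question, randomising the order and limiting the output to 20, could be sketched like this. The $items array here is an illustrative stand-in with the same catid/words shape as the loop above produces; shuffle() and array_slice() are standard PHP functions.

```php
<?php
// Stand-in for the $items array built above; the values are illustrative only.
$items = array();
for ($n = 1; $n <= 30; $n++) {
    $items[] = array('catid' => (string) ($n % 3 + 18), 'words' => 'word' . $n);
}

shuffle($items);                      // randomise the order in place
$cloud = array_slice($items, 0, 20);  // keep at most 20 entries

foreach ($cloud as $entry) {
    echo $entry['words'], ' (catid ', $entry['catid'], ')', PHP_EOL;
}
```

Note that shuffle() takes the whole array by reference, so it belongs after the array has been built rather than inside the loop on a single word.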

Hope this helps :)

Answer 1 (score: 0)

$query = mysql_query("SELECT * FROM hesk_kb_articles WHERE type = '0'");

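With SELECT *, DISTINCT or GROUP BY cannot collapse the rows, because every row includes its unique id; you have to select only the fields you need.

Queries like these would help: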

$query = mysql_query("SELECT DISTINCT subject, catid FROM hesk_kb_articles WHERE type = '0'");

$query = mysql_query("SELECT subject, catid FROM hesk_kb_articles WHERE type = '0' group by subject, catid");

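The id column is unique, so you cannot use SQL to reduce the number of returned records if you also need the id in your PHP code. If necessary, you can fetch the id from SQL afterwards by its subject-catid pair.

You should consider a UNIQUE constraint on your table to avoid this kind of "duplicate": if a tag subject already exists in a category, you should not insert it again.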
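A minimal sketch of what such a constraint does, using a hypothetical table tag_words(word, catid), since the real schema is not shown in the post. It runs on an in-memory SQLite database purely to stay self-contained; the UNIQUE (word, catid) clause is the part that carries over.

```php
<?php
// Hypothetical table and columns; SQLite in memory is an assumption for the demo.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec(
    'CREATE TABLE tag_words (word TEXT NOT NULL, catid INTEGER NOT NULL, UNIQUE (word, catid))'
);

$insert = $pdo->prepare('INSERT INTO tag_words (word, catid) VALUES (?, ?)');
$rejected = 0;
foreach (array(array('dog', 19), array('dog', 18), array('dog', 19)) as $pair) {
    try {
        $insert->execute($pair);
    } catch (PDOException $e) {
        // In this demo the only expected failure is a repeated (word, catid) pair.
        $rejected++;
    }
}

$stored = (int) $pdo->query('SELECT COUNT(*) FROM tag_words')->fetchColumn();
echo "stored: $stored, rejected: $rejected", PHP_EOL;
```

The same word is accepted once per catid, and the repeated (dog, 19) pair is refused by the database itself.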