Multi-term query

Time: 2012-10-16 20:13:46

Tags: php sql

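I have a search engine that scans all the words in a given set of web pages and then reports where they occur. The results are ranked by how many times the word occurs in each document. However, it does not return results for queries with more than one term.

Below is my SQL query. I would like it to check every word that was entered and then rank the pages by the number of times those words appear in the document. At the moment it only works for single-term queries.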

$result = mysql_query("SELECT p.page_url AS url,
                              COUNT(*) AS occurrences
                       FROM page p, word w, occurrence o
                       WHERE p.page_id = o.page_id
                         AND w.word_id = o.word_id
                         AND w.word_word = \"$keyword\"
                       GROUP BY p.page_id
                       ORDER BY occurrences DESC
                       LIMIT $results");

2 answers:

Answer 0 (score: 1)

If you want to count all words, this condition in your WHERE clause will not let you do so:

w.word_word = \"$keyword\"

Your query can be rewritten as follows:

// The table is named "occurrence" in the question, so the joins use that spelling.
$sql = "SELECT p.page_url AS url, COUNT(*) AS occurrences "
     . "FROM page p "
     . "INNER JOIN occurrence o ON p.page_id = o.page_id "
     . "INNER JOIN word w ON w.word_id = o.word_id "
     . "GROUP BY p.page_id "
     . "ORDER BY occurrences DESC "
     . "LIMIT {$results}";
$result = mysql_query($sql);

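This counts the occurrences of every word in the word table, with no keyword filter, which (as far as I understand) gives you the result you need.

If you are only interested in a few words, you can use an IN clause (suggested by Dev in the comments), and your query becomes: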

$my_keywords = array('apple', 'banana');
// Escape each keyword before it is placed inside the SQL string.
$my_keywords = array_map('mysql_real_escape_string', $my_keywords);
// This produces: "apple","banana" and assumes that all of your
// keywords are in lower case. If not, you can transform them to lower
// case or, if you don't want that, remove the LOWER() function below
// from the WHERE
$keywords    = '"' . implode('","', $my_keywords) . '"';
$sql = "SELECT p.page_url AS url, COUNT(*) AS occurrences "
     . "FROM page p "
     . "INNER JOIN occurrence o ON p.page_id = o.page_id "
     . "INNER JOIN word w ON w.word_id = o.word_id "
     . "WHERE LOWER(w.word_word) IN ({$keywords}) "
     . "GROUP BY p.page_id "
     . "ORDER BY occurrences DESC "
     . "LIMIT {$results}";
$result = mysql_query($sql);

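Finally, try to use mysqli or PDO instead of the mysql extension. The mysql_* functions were deprecated in PHP 5.5 and removed in PHP 7.0.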
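
As a rough illustration of that advice, the IN query could be built with placeholders and run through a prepared statement. This is only a sketch: buildKeywordQuery is a hypothetical helper name, the $pdo connection is assumed to exist already, and the table and column names are taken from the question.

```php
<?php
// Hypothetical helper: builds the SQL text and the parameter list for an
// arbitrary number of keywords. Table and column names follow the question.
function buildKeywordQuery(array $keywords, $limit)
{
    if (count($keywords) === 0) {
        throw new InvalidArgumentException('At least one keyword is required.');
    }

    // Normalize to lower case to mirror the LOWER() comparison in the query.
    $params = array_map('strtolower', array_values($keywords));
    // One "?" placeholder per keyword, e.g. "?,?" for two keywords.
    $placeholders = implode(',', array_fill(0, count($params), '?'));

    $sql = "SELECT p.page_url AS url, COUNT(*) AS occurrences "
         . "FROM page p "
         . "INNER JOIN occurrence o ON p.page_id = o.page_id "
         . "INNER JOIN word w ON w.word_id = o.word_id "
         . "WHERE LOWER(w.word_word) IN ({$placeholders}) "
         . "GROUP BY p.page_id "
         . "ORDER BY occurrences DESC "
         . "LIMIT " . (int) $limit; // the cast keeps the limit numeric

    return array($sql, $params);
}

list($sql, $params) = buildKeywordQuery(array('Apple', 'banana'), 10);

// Assumed usage with an existing PDO connection in $pdo:
// $stmt = $pdo->prepare($sql);
// $stmt->execute($params);
// $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

Because the keywords travel as bound parameters, they never become part of the SQL text, so no manual quoting or escaping is needed.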

HTH

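Answer 1 (score: 1)

I would use MATCH ... AGAINST, which MySQL optimizes for search-engine-style lookups and which should therefore perform better. You should look into full-text search: http://dev.mysql.com/doc/refman/5.5/en//fulltext-search.html

Note: the keyword column (word_word here) should have a FULLTEXT index in the MySQL table. This gives much better search performance.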

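Example:

Example keyword input (in boolean mode a leading + marks a term as required in the matched row; if word_word holds a single word per row, omit the + signs so that any of the terms can match):

$keywords = '+Word +Word2 +Word3';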

SELECT p.page_url AS url,
       COUNT(*) AS occurrences,
       MATCH(w.word_word) AGAINST('$keywords') AS keyword
FROM page p, occurrence o, word w
WHERE MATCH(w.word_word) AGAINST('$keywords' IN BOOLEAN MODE)
  AND p.page_id = o.page_id
  AND w.word_id = o.word_id
GROUP BY p.page_id
ORDER BY occurrences DESC
LIMIT $results
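
The boolean-mode string is usually assembled from whatever the user typed. The following is a minimal sketch of one way to do that; buildBooleanSearch is a hypothetical helper, not part of MySQL or PHP, and it simply discards any character that is not a letter or a digit so that users cannot inject boolean operators of their own.

```php
<?php
// Hypothetical helper: turns raw user input into a boolean-mode search string.
function buildBooleanSearch($input, $requireAll = true)
{
    // Split on anything that is not a letter or digit; this also drops
    // boolean-mode operators such as + - < > ( ) ~ * " typed by the user.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $input, -1, PREG_SPLIT_NO_EMPTY);

    // "+" makes a term mandatory; without it the terms are optional.
    $prefix = $requireAll ? '+' : '';
    $terms = array();
    foreach ($words as $word) {
        $terms[] = $prefix . $word;
    }
    return implode(' ', $terms);
}

$keywords = buildBooleanSearch('honda  jazz, +manual');
// $keywords is now "+honda +jazz +manual"

// Assumed usage: bind the string instead of interpolating it, e.g.
// ... WHERE MATCH(w.word_word) AGAINST(? IN BOOLEAN MODE) ...
```

Passing false as the second argument yields plain space-separated terms, which matches rows containing any one of them.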

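Otherwise, if your query is not optimized (too many GROUP BY clauses, WHERE clauses, and conditions), you risk degrading the performance of your server. As an alternative, you can use a regular expression in MySQL, for example: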

w.word_word REGEXP "(honda)|(jazz)|(manual)"

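Regular expressions can also perform acceptably, although they are not recommended for large databases because a REGEXP comparison cannot use an index.

Build the conditions in a loop instead of putting everything into one REGEXP: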

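$keywords = "keyword1,keyword2,keyword3";

$expl = explode(",", $keywords);

// [[:<:]] and [[:>:]] are the word-boundary markers in MySQL before 8.0.
if (count($expl) == 1)
{
    $all = "w.word_word REGEXP '[[:<:]]{$keywords}[[:>:]]'";
}
else
{
    $parts = array();
    foreach ($expl as $keyone)
    {
        $parts[] = "w.word_word REGEXP '[[:<:]]{$keyone}[[:>:]]'";
    }
    // Parentheses keep the OR group from mixing with the AND conditions.
    $all = '(' . implode(' OR ', $parts) . ')';
}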

// Double quotes are required so that $all and $results are interpolated.
$sql = "SELECT p.page_url AS url,
COUNT(*) AS occurrences
FROM page p, word w, occurrence o
WHERE p.page_id = o.page_id AND
w.word_id = o.word_id AND
$all
GROUP BY p.page_id
ORDER BY occurrences DESC
LIMIT $results";

$result_query = mysql_query($sql);
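
For comparison, the same keywords can also be folded into a single alternation pattern, so the query needs only one REGEXP condition. This is a sketch under a few assumptions: buildRegexpPattern is a hypothetical helper, keywords are reduced to ASCII letters and digits to keep them from altering the pattern, and the [[:<:]] and [[:>:]] markers apply to MySQL versions before 8.0 (8.0 and later use \b instead).

```php
<?php
// Hypothetical helper: builds one word-boundary REGEXP pattern for all keywords.
function buildRegexpPattern($csv)
{
    $terms = array();
    foreach (explode(',', $csv) as $term) {
        // Keep only ASCII letters and digits so a keyword cannot change the pattern.
        $term = preg_replace('/[^A-Za-z0-9]/', '', $term);
        if ($term !== '') {
            $terms[] = $term;
        }
    }
    if (count($terms) === 0) {
        return null; // nothing usable to search for
    }
    // e.g. [[:<:]](honda|jazz|manual)[[:>:]]
    return '[[:<:]](' . implode('|', $terms) . ')[[:>:]]';
}

$pattern = buildRegexpPattern('honda, jazz,manual');
// $pattern is now "[[:<:]](honda|jazz|manual)[[:>:]]"

// Assumed usage: ... AND w.word_word REGEXP ? ... with $pattern bound as the parameter.
```

The index limitation still applies: every row of the word table has to be tested against the pattern, so this suits small tables only.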