Question

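I want to count how many times the words a user submits appear in the articles stored in my MySQL database, and then display the results ordered from the most occurrences to the fewest.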

Here is my PHP & MySQL code.

$x = 0;
$con = null;
$search = $_REQUEST['search'];

$search_explode = mysqli_real_escape_string($dbc, $search);
$search_explode = explode(' ', $search_explode);

foreach($search_explode as $search_each) {
    $x++;
    if($x == 1){
        $con .= " article_content LIKE '%$search_each%' OR title LIKE '%$search_each%' OR summary LIKE '%$search_each%'";
    } else {
        $con .= " OR article_content LIKE '%$search_each%' OR title LIKE '%$search_each%' OR summary LIKE '%$search_each%'";
    }
}

$con = "SELECT users.*, users_articles.* FROM users_articles
              INNER JOIN users ON users_articles.user_id = users.user_id
              WHERE ($con) 
              AND users.active IS NULL
              AND users.deletion = 0";

$run =  mysqli_query($dbc, $con);
$search_term = mysqli_num_rows($run);

Answer 1

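Once you have the article stored as a string in a variable, you can use substr_count to find the number of occurrences of a particular string.

If you need general information about the words used in an article, you can use str_word_count to get a list of all the words in the string and work from that.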
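
As a rough sketch of that approach, the following counts the search words in each article with substr_count and sorts the articles by that count. The sample rows and the helper names count_occurrences and rank_articles are hypothetical; in the real script the rows would come from the users_articles query in the question.

```php
<?php
// Hypothetical sample data standing in for rows fetched from users_articles.
$articles = array(
    array('title' => 'Intro to PHP', 'article_content' => 'PHP is fun. PHP and MySQL work well together.'),
    array('title' => 'MySQL tips',   'article_content' => 'MySQL indexes speed up queries.'),
    array('title' => 'Gardening',    'article_content' => 'Tomatoes need sun.'),
);

// Count how often any of the search words occurs in one piece of text.
// Both sides are lowercased because substr_count() is case-sensitive.
function count_occurrences($text, array $words) {
    $text = strtolower($text);
    $total = 0;
    foreach ($words as $word) {
        if ($word !== '') { // substr_count() rejects an empty needle
            $total += substr_count($text, strtolower($word));
        }
    }
    return $total;
}

// Keep only the articles that contain a search word, highest count first.
function rank_articles(array $articles, $search) {
    $words = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    $ranked = array();
    foreach ($articles as $article) {
        $hits = count_occurrences($article['title'] . ' ' . $article['article_content'], $words);
        if ($hits > 0) {
            $article['hits'] = $hits;
            $ranked[] = $article;
        }
    }
    usort($ranked, function ($a, $b) { return $b['hits'] - $a['hits']; });
    return $ranked;
}

$ranked = rank_articles($articles, 'php mysql');
foreach ($ranked as $row) {
    echo $row['title'], ': ', $row['hits'], "\n";
}

// Per-word frequencies of a single article via str_word_count (format 1 = list of words).
$frequencies = array_count_values(str_word_count(strtolower($articles[0]['article_content']), 1));
```

Note that substr_count matches substrings, so a search for "cat" would also count "category". Counting in PHP also means every candidate article has to be fetched first, which may be slow on a large table.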

Answer 2

You want to find all occurrences of a word in a string:

<?php 
function findall($needle, $haystack) 
{ 
    //Setting up 
    $buffer=''; //We will use a 'frameshift' buffer for this search 
    $pos=0; //Pointer 
    $end = strlen($haystack); //The end of the string 
    $getchar=''; //The next character in the string 
    $needlelen=strlen($needle); //The length of the needle to find (speeds up searching) 
    $found = array(); //The array we will store results in 

    while($pos<$end)//Scan file 
    { 
        $getchar = substr($haystack,$pos,1); //Grab next character from pointer 
        if($getchar!="\n" || strlen($buffer)<$needlelen) //Skip a line break once the buffer has reached the needle length; otherwise process the character 
        { 
            $buffer = $buffer . $getchar; //Build frameshift buffer 
            if(strlen($buffer)>$needlelen) //If the buffer is longer than the needle 
            { 
                $buffer = substr($buffer,-$needlelen);//Truncate backwards to needle length (backwards so that the frame 'moves') 
            } 
            if($buffer==$needle) //If the buffer matches the needle 
            { 
                $found[]=$pos-$needlelen+1; //Add the location of the needle to the array. Adding one fixes the offset. 
            } 
        } 
        $pos++; //Increment the pointer 
    } 
    if(array_key_exists(0,$found)) //Check for an empty array 
    { 
        return $found; //Return the array of located positions 
    } 
    else 
    { 
        return false; //Or if no instances were found return false 
    } 
} 
?>

From http://php.net/manual/en/function.strstr.php

...

Another one:

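<?php
function find_occurences($string, $find) {
    // Lowercase both strings once so the search is case-insensitive
    $haystack = strtolower($string);
    $needle = strtolower($find);

    // An empty needle cannot be searched for; treat it as "not found"
    if ($needle === '' || strpos($haystack, $needle) === FALSE) {
        return FALSE;
    }

    $positionarray = array();
    $count = substr_count($haystack, $needle); // computed once, not on every iteration
    $pos = -1;
    for ($i = 0; $i < $count; $i++) {
        // Continue searching just past the previous match
        $pos = strpos($haystack, $needle, $pos + 1);
        $positionarray[] = $pos;
    }

    return $positionarray;
}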

From http://www.phpfreaks.com/forums/index.php?topic=195567.0
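
Both functions above match substrings, so "cat" is also found inside "category". If only whole words should count, a regular expression with word boundaries is one possible alternative. This is a minimal sketch; the helper name count_whole_word is made up for illustration and is not part of either source.

```php
<?php
// Count whole-word, case-insensitive occurrences of $word in $text.
function count_whole_word($text, $word) {
    if ($word === '') {
        return 0;
    }
    // preg_quote() escapes regex metacharacters in the user-supplied word;
    // \b anchors the match to word boundaries, i = case-insensitive, u = UTF-8.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/iu';
    $count = preg_match_all($pattern, $text);
    return $count === false ? 0 : $count;
}

$text = 'A cat sat in the Cat category; concatenate cat.';
echo count_whole_word($text, 'cat'), "\n";          // whole words only
echo substr_count(strtolower($text), 'cat'), "\n";  // substrings as well
```

The \b anchor only behaves as expected when the search word starts and ends with a letter, digit, or underscore, so terms such as "C++" would need different handling.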

Answer 3

This is very simple with full-text search. For example:

SELECT *,
MATCH(title, body) AGAINST ('PHP') AS score
FROM articles
WHERE MATCH(title, body) AGAINST('PHP')

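According to the MySQL manual, full-text search is a "natural language search": it indexes the words that appear to characterize each row, using the columns you specify. For example, if all of your rows contain "MySQL", then "MySQL" won't match much. The word is not distinctive and would return too many results. However, if "MySQL" is present in only 5% of the rows, the search returns those rows, because the word does not occur often enough to be treated as a very common keyword. (If none of your rows contain "MySQL", it returns nothing; duh.)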

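MySQL also does something pretty useful: it creates a score. This score is usually something like .9823475 or .124874, but it is always greater than zero. It can go above 1, and I have sometimes seen it reach 4. (Don't try to multiply it by 100 and present it as a percentage; people will wonder why their keyword matches an article 431%!)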

MySQL will also sort the rows by their score, in descending order.
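
Applied to the tables from the question, the idea might look like the sketch below. It assumes that title, summary and article_content all live in users_articles, that a FULLTEXT index covers exactly those three columns, and that the mysqlnd driver is available (needed for mysqli_stmt_get_result). The constant SEARCH_SQL and the helper search_articles are hypothetical names. The explicit ORDER BY keeps the ordering from depending on MySQL's default behavior once a JOIN is involved.

```php
<?php
// Assumption: a FULLTEXT index exists on exactly these three columns, e.g.
//   ALTER TABLE users_articles ADD FULLTEXT (title, summary, article_content);
const SEARCH_SQL = "
    SELECT users.*, users_articles.*,
           MATCH(users_articles.title, users_articles.summary, users_articles.article_content)
               AGAINST (?) AS score
    FROM users_articles
    INNER JOIN users ON users_articles.user_id = users.user_id
    WHERE MATCH(users_articles.title, users_articles.summary, users_articles.article_content)
              AGAINST (?)
      AND users.active IS NULL
      AND users.deletion = 0
    ORDER BY score DESC";

// Hypothetical helper: runs the search on an open mysqli connection and
// returns the matching rows as associative arrays, highest score first.
function search_articles(mysqli $dbc, $search) {
    $stmt = mysqli_prepare($dbc, SEARCH_SQL);
    // The search text is bound twice, once for each MATCH ... AGAINST (?)
    mysqli_stmt_bind_param($stmt, 'ss', $search, $search);
    mysqli_stmt_execute($stmt);
    $result = mysqli_stmt_get_result($stmt);
    return mysqli_fetch_all($result, MYSQLI_ASSOC);
}
```

Binding the search text as a parameter also removes the need for mysqli_real_escape_string. Keep in mind that the score is a relevance value, not a literal count of how many times the word occurs.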

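Another useful tidbit: if you use MATCH() AGAINST() twice in one query, as we do here, there is no additional speed penalty. You might expect the query to take twice as long because the same search is executed twice, but MySQL actually remembers the results of the first search when it runs the second.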

To learn more, see: http://devzone.zend.com/article/1304
