Question

Hello, I have a database table that looks like this:

word_id int(10)
word varchar(30)

I have a text, and I want to find out which of its words are defined in that table. What is the most elegant way to do this?

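Currently I query the database for all of the words and then use PHP to search the whole text for each one. It takes PHP a long time to download every word from the database and then check each of them against my text.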

Answer 1

You could try extracting the words from the text and putting them into a SELECT query, like this:

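$words = array_unique(get_words_in_text(...));
// quote and escape each word ($pdo is assumed to be an open PDO connection)
$quoted = array_map(array($pdo, 'quote'), $words);
$sql = "SELECT * FROM words WHERE word IN (".implode(", ", $quoted).")";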

Your SQL engine may well optimize this statement. In any case, it puts less load on the database connection than your current approach.
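A minimal sketch of the IN (...) idea with bound parameters instead of string concatenation. The in-memory SQLite database, the sample rows, the body of get_words_in_text() and the find_defined_words() helper are all assumptions made for illustration; only the table and column names come from the question.

```php
<?php
// Hypothetical helper: lowercases the text (ASCII only) and splits it on
// anything that is not a letter, then drops duplicates.
function get_words_in_text(string $text): array
{
    $words = preg_split('/[^\p{L}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique($words));
}

// Looks up all given words with a single IN (...) query, one placeholder per word.
function find_defined_words(PDO $pdo, array $words): array
{
    $words = array_values($words); // positional placeholders need keys 0..n-1
    if (count($words) === 0) {
        return array(); // "IN ()" is not valid SQL
    }
    $placeholders = implode(', ', array_fill(0, count($words), '?'));
    $stmt = $pdo->prepare("SELECT word FROM words WHERE word IN ($placeholders)");
    $stmt->execute($words);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE words (word_id INTEGER PRIMARY KEY, word VARCHAR(30))");
$pdo->exec("INSERT INTO words (word) VALUES ('apple'), ('banana'), ('cherry')");

$found = find_defined_words($pdo, get_words_in_text('An apple and a cherry, but no pear.'));
sort($found);
print_r($found);
```

For very long texts the word list may run into the database's limit on bound parameters or query size; splitting it with array_chunk() and merging the results is one way around that.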

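You could also try creating a separate temporary table and adding all the words from the text to it. Then you can JOIN it with the main words table. If both tables are properly indexed, this may be very fast.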
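The temporary-table variant could look roughly like this. It is a sketch under assumptions: an in-memory SQLite database stands in for the real one, and the text_words table, the idx_words_word index and the sample data are invented for the example.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE words (word_id INTEGER PRIMARY KEY, word VARCHAR(30))");
$pdo->exec("CREATE INDEX idx_words_word ON words (word)");
$pdo->exec("INSERT INTO words (word) VALUES ('apple'), ('banana'), ('cherry')");

// Words already extracted from the text (extraction is out of scope here).
$text_words = array('an', 'apple', 'and', 'a', 'cherry', 'apple');

// Temporary table for the words of the text; its primary key doubles as the index.
$pdo->exec("CREATE TEMPORARY TABLE text_words (word VARCHAR(30) PRIMARY KEY)");

// Insert each distinct word once, inside one transaction to keep it fast.
$pdo->beginTransaction();
$insert = $pdo->prepare("INSERT INTO text_words (word) VALUES (?)");
foreach (array_unique($text_words) as $word) {
    $insert->execute(array($word));
}
$pdo->commit();

// The JOIN returns only the words that appear in both tables.
$defined = $pdo->query(
    "SELECT w.word_id, w.word
       FROM words w
       JOIN text_words t ON t.word = w.word
      ORDER BY w.word"
)->fetchAll(PDO::FETCH_ASSOC);

print_r($defined);
```

Deduplicating in PHP before inserting keeps the INSERT statement portable, since the syntax for ignoring duplicate keys differs between database engines.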

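Edit: This question/answer indicates that creating a temporary table is indeed faster (see the comments): mysql select .. where .. in -> optimizing. Of course, it depends on the specific database you are using, the size of the words table, the size of the text, and how the indexes are configured. So I recommend evaluating both approaches for your particular scenario. Please report your results. :-)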

Answer 2

One idea:

// get words in file into array (split on any run of whitespace, including newlines)
$file = file_get_contents('file.txt');
$file_words = preg_split('/\s+/', $file, -1, PREG_SPLIT_NO_EMPTY);

// remove duplicate words
$file_words = array_unique($file_words);

// create empty array in which to store hits
$words_with_definition = array();

// intentionally leaving out db connection, this is just a concept;
// $pdo is assumed to be an open PDO connection
$stmt = $pdo->prepare("SELECT word FROM your_table WHERE word = ?");

// check to see if each word exists in database
// (foreach, because array_unique() leaves gaps in the numeric keys)
foreach ($file_words as $file_word)
{
    // word should be at least three characters, change as needed
    if (strlen($file_word) >= 3)
    {
        $stmt->execute(array($file_word));

        // fetchColumn() returns false when no row matched
        if ($stmt->fetchColumn() !== false)
        {
            // this is a hit, add it to $words_with_definition
            $words_with_definition[] = $file_word;
        }
    }
}

Anything in the $words_with_definition array is a word that was found in the database.

Searching a text for predefined words
