Counting the occurrences of specific words in database rows

Time: 2016-02-16 12:01:00

Tags: php mysql database count

I have two word lists and a database containing more than a thousand news articles.

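For every article in the database, I want to count how many words from the $badwords and $goodwords lists it contains. Then I want to store the two counts for each row in the badwords and goodwords columns. I will run this script as a cron job.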

My current table structure (the last two columns are empty):

TABLE news
-----------------
|ID|newstitle|newscontent|badwords|goodwords|
|1| Rain in London | It is horrible depressive weather in this nice city. | EMPTY | EMPTY |
|2| France wins the WorldCup | The player made a great goal. | EMPTY | EMPTY |

The table structure I want (the last two columns hold the number of $badwords and $goodwords matches):

TABLE news
-----------------
|ID|newstitle|newscontent|badwords|goodwords|
|1| Rain in London | It is horrible depressive weather in this nice city. | 2 | 1 |
|2| France wins the WorldCup | The player made a great goal. | 0 | 1 |

My current PHP code:

<?php
//the wordlists
$badwords = "depressive horrible";
$goodwords = "great";

//connection to the database
$servername = "localhost";
$username = "user";
$password = "pass";
$dbname = "db";

$conn = new mysqli($servername, $username, $password, $dbname);

// here is my sql query

$sql = " UPDATE news
set badwords = (SELECT count (*) from news
where newscontent LIKE '.%$badwords%.')";    

//close the connection
$conn->close();
?>

1 answer:

Answer 0: (score: 0)

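If I understand your question correctly, you want to check whether a certain word list exists in the database. In that case you are looking for a query like the following (depending on the database class you use, you can also escape the value in the query, for example with mysqli_real_escape_string()):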

SELECT    COUNT(*) AS `count`
         ,`newscontent` 
FROM      `news`
WHERE     `newscontent` = '" . $wordlist . "'
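
As an alternative to escaping by hand, the value can be bound through a prepared statement. The following is only a minimal sketch, assuming a substring match with LIKE is wanted rather than the exact equality used above. The helper names likeContains() and countRowsContaining() are hypothetical and not part of the original answer; the table and column names are taken from the question.

```php
<?php

/**
 * Build a LIKE pattern that matches $word anywhere in a column.
 * The LIKE wildcards (% and _) and the backslash escape character
 * are escaped so that they are matched literally.
 */
function likeContains(string $word): string
{
    return '%' . addcslashes($word, '\\%_') . '%';
}

/**
 * Count the rows of `news` whose `newscontent` contains $word.
 * Assumes $db is an open mysqli connection that throws on errors
 * (the default since PHP 8.1).
 */
function countRowsContaining(mysqli $db, string $word): int
{
    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM `news` WHERE `newscontent` LIKE ?'
    );
    $pattern = likeContains($word);
    // The pattern is sent separately from the SQL text, so no quoting is needed.
    $stmt->bind_param('s', $pattern);
    $stmt->execute();
    $stmt->bind_result($count);
    $stmt->fetch();
    $stmt->close();

    return (int) $count;
}
```

Note that this counts rows containing the word, not the number of times the word appears inside each row.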

If you want to show how many times each word list occurs in the database, this is what you need:

SELECT    COUNT(*) AS `count`
         ,`newscontent` 
FROM      `news`
GROUP BY  `newscontent`

If you want to show how many strings there are for each word count, this is what you are looking for:

<?php
  $sql = new mysqli($host, $user, $password, $database);
  $query = $sql->query('select * from `news`');
  $summary = [];

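  while ($record = $query->fetch_object()) {
    // number of space-separated words in this article
    $words = count(explode(' ', $record->newscontent));
    // initialise the bucket on first use to avoid an undefined-index notice
    $summary[$words] = ($summary[$words] ?? 0) + 1;
  }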

  echo '<pre>';
  print_r($summary);
  echo '</pre>';

If none of the above is what you are looking for, then after reading your question four times I have no idea what you are after.

Updated answer: since you have updated your question, I now understand what you need. See the updated answer below.

<?php
  // your db connection ...

  // array with good and bad words
  $good = [
    'awesome',
    'neat',
    'fantastic',
    'great',
    // and so on
  ];

  $bad = [
    'horrible',
    'worst',
    'bad',
    'terrible',
    // and so on
  ];

  // if you keep using your string approach you can set $good and $bad with $good = explode(' ', $goodwords); and $bad = explode(' ', $badwords);

  // fetch the record you need
  $query = $sql->query('select * from `news` where `ID` = 1'); // insert parameter for your ID here instead of just 1
  $newsitem = $query->fetch_object();

  // set up good and bad word counters
  $totalGood = 0;
  $totalBad = 0;

  // check how many times each word is mentioned in newscontent
  foreach($good as $word) { 
    // add spaces around the word so only the full word matches, not a part of one
    // (note: this misses words at the start or end of the text or next to punctuation)
    $totalGood += substr_count($newsitem->newscontent, ' ' . $word . ' ');
  }

  // check how many times each word is mentioned in newscontent
  foreach($bad as $word) { 
    // add spaces around the word so only the full word matches, not a part of one
    $totalBad += substr_count($newsitem->newscontent, ' ' . $word . ' ');
  }  

  // update the record
  $sql->query("
    update `news` 
    set `badwords` = " . (int) $totalBad . ",
        `goodwords` = " . (int) $totalGood . "
    where `ID` = " . (int) $newsitem->ID);
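
The code above handles a single record and only matches words surrounded by spaces. For a cron job that covers every article, something along the following lines might work. This is a sketch under stated assumptions, not a tested implementation: countListedWords() and updateAllCounts() are hypothetical names, the regular expression uses \b word boundaries in place of the space padding (so words next to punctuation or at the start of the text are counted too), matching is case-insensitive, and the table and column names are the ones from the question.

```php
<?php

/**
 * Count whole-word, case-insensitive occurrences of any word
 * from $words inside $text.
 */
function countListedWords(string $text, array $words): int
{
    if ($words === []) {
        return 0;
    }
    // Quote each word so regex metacharacters are matched literally.
    $quoted = array_map(
        fn(string $w): string => preg_quote($w, '/'),
        $words
    );
    // \b anchors each match at word boundaries; "u" treats the input as UTF-8.
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/iu';
    $matches = preg_match_all($pattern, $text);

    // preg_match_all() returns false on error (e.g. invalid UTF-8).
    return $matches === false ? 0 : $matches;
}

/**
 * Recount every row of `news` and store the totals.
 * Assumes $db is an open mysqli connection that throws on errors.
 * Returns the number of rows processed.
 */
function updateAllCounts(mysqli $db, array $good, array $bad): int
{
    $update = $db->prepare(
        'UPDATE `news` SET `badwords` = ?, `goodwords` = ? WHERE `ID` = ?'
    );
    // query() buffers the result set by default, so the prepared
    // statement can be executed while iterating over the rows.
    $result = $db->query('SELECT `ID`, `newscontent` FROM `news`');

    $rows = 0;
    while ($row = $result->fetch_assoc()) {
        $badCount  = countListedWords($row['newscontent'], $bad);
        $goodCount = countListedWords($row['newscontent'], $good);
        $id        = (int) $row['ID'];

        $update->bind_param('iii', $badCount, $goodCount, $id);
        $update->execute();
        $rows++;
    }
    $update->close();

    return $rows;
}
```

With the string lists from the question, the call could look like updateAllCounts($sql, explode(' ', $goodwords), explode(' ', $badwords)), where $sql is the mysqli connection used above.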

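One interesting thing that remains with text interpretation is sarcasm. How would you handle something like "Well, the weather in England is nice again, as usual!" ;) Good luck!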