Count occurrences of different words per row

Date: 2016-08-10 18:37:01

Tags: mysql sql

I have a MySQL table like this:

id         content
-----      ------
1          Big green tree
2          Small green tree
3          Green tree 
4          Small yellow tree
5          Big green lake
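
To reproduce the examples, the table above could be created roughly as follows. This is only a sketch: the column types are assumptions (the question shows values, not a schema), and the table name elements is taken from the queries further down.

```sql
-- Assumed schema: the question does not state the column types.
CREATE TABLE `elements` (
  `id`      INT PRIMARY KEY,
  `content` VARCHAR(255) NOT NULL
);

-- Sample rows exactly as listed in the question.
INSERT INTO `elements` (`id`, `content`) VALUES
  (1, 'Big green tree'),
  (2, 'Small green tree'),
  (3, 'Green tree'),
  (4, 'Small yellow tree'),
  (5, 'Big green lake');
```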

I want to count how many of the different search words occur in each row.

Example: if I search for "Big green tree", it should return the following result:

id         count
-----      ------
1          3
2          2
3          2 
4          1
5          2

I tried something like this:

SELECT `content`
     , COUNT(*) as count 
  FROM `elements` 
 WHERE `content` LIKE '%Big%' 
    OR `content` LIKE '%green%' 
    OR `content` LIKE '%tree%' 
GROUP 
    BY `id` 
 ORDER BY count DESC;

It does not work, because it only counts one row for each match:

id         count
-----      ------
1          1
2          1
3          1 
4          1
5          1

3 Answers:

Answer 0 (score: 4)

You can use REGEXP together with word boundaries. The resulting match is case-insensitive. If you need a case-sensitive match, use REGEXP BINARY.

SELECT `content`, 
CASE WHEN `content` REGEXP '[[:<:]]big[[:>:]]' THEN 1 ELSE 0 END +
CASE WHEN `content` REGEXP '[[:<:]]green[[:>:]]' THEN 1 ELSE 0 END +
CASE WHEN `content` REGEXP '[[:<:]]tree[[:>:]]' THEN 1 ELSE 0 END
       as num_matches        
FROM `elements`
ORDER BY id

Edit: Based on the OP's comment, to get only the rows with num_matches > 0:

SELECT * FROM (
SELECT `content`, 
CASE WHEN `content` REGEXP '[[:<:]]big[[:>:]]' THEN 1 ELSE 0 END +
CASE WHEN `content` REGEXP '[[:<:]]green[[:>:]]' THEN 1 ELSE 0 END +
CASE WHEN `content` REGEXP '[[:<:]]tree[[:>:]]' THEN 1 ELSE 0 END
       as num_matches        
FROM `elements`) t
WHERE num_matches > 0
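
One caveat about versions: from MySQL 8.0.4 on, REGEXP is implemented with the ICU library, which does not accept the [[:<:]] and [[:>:]] markers. The documented replacement is \b. A minimal sketch of the same count under that assumption follows. The backslash is doubled because MySQL treats \ as an escape character inside string literals, and selecting id instead of content is just a choice made here to match the output the question asks for.

```sql
-- Assumes MySQL 8.0.4 or later (ICU regular expressions).
-- REGEXP evaluates to 1 or 0, so the three results can be added directly.
SELECT `id`,
       (`content` REGEXP '\\bbig\\b')
     + (`content` REGEXP '\\bgreen\\b')
     + (`content` REGEXP '\\btree\\b') AS num_matches
FROM `elements`
ORDER BY `id`;
```

With ICU, whether the match ignores case depends on the collation of content. With a case-insensitive collation, the sample rows should give 3, 2, 2, 1, 2 for ids 1 to 5.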

Answer 1 (score: 3)

If you don't care about duplicate words within content:

SELECT `content`, 
       ((CASE WHEN `content` LIKE '%Big%' THEN 1 ELSE 0 END) +
        (CASE WHEN `content` LIKE '%green%' THEN 1 ELSE 0 END) +
        (CASE WHEN `content` LIKE '%tree%' THEN 1 ELSE 0 END)
       ) as matches        
FROM `elements`
WHERE `content` LIKE '%Big%' OR
      `content` LIKE '%green%' OR
      `content` LIKE '%tree%'
ORDER BY matches DESC;
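
Note that LIKE '%tree%' matches any substring, so a value such as 'Main street' would also count as containing "tree". If whole-word matching is needed without regular expressions, one possible workaround is to pad both the column and the pattern with spaces. This is a sketch that assumes words in content are separated by single spaces and carry no punctuation:

```sql
-- Assumption: words are separated by plain spaces, with no punctuation.
-- Padding content with a space on each side lets '% tree %' match
-- "tree" only as a whole word, including at the start or end of the string.
SELECT `id`,
       (CONCAT(' ', `content`, ' ') LIKE '% big %')
     + (CONCAT(' ', `content`, ' ') LIKE '% green %')
     + (CONCAT(' ', `content`, ' ') LIKE '% tree %') AS matches
FROM `elements`
ORDER BY matches DESC;
```

As with the query above, whether LIKE ignores case depends on the collation of content.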

Answer 2 (score: 1)

If you don't want to use CASE, you can count the words like this:

SELECT id, COUNT(*) as count 
  FROM (
     select id from elements WHERE content LIKE '%Big%'
     union all 
     select id from elements WHERE content LIKE '%green%'
     union all 
     select id from elements WHERE content LIKE '%tree%'
  ) as t
GROUP BY id
ORDER BY count DESC;