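Counting surnames in the ranges a-d, e-h, i-l, ... with MySQL

Time: 2011-10-11 10:53:05

Tags: mysql

I have a table of surnames, and I want to count the number of surnames in each letter range, such as A-D or E-H.

I came up with the query below, which works. I would like to hear what people think of it, and whether there is a better way.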

select count(*) FROM people 
group by surname REGEXP '^[a-d].*', 
         surname REGEXP '^[e-h].*', 
         surname REGEXP '^[i-l].*', 
         surname REGEXP '^[m-p].*', 
         surname REGEXP '^[q-t].*', 
         surname REGEXP '^[u-z].*';

4 answers:

Answer 0: (score: 4)

This is the best way to achieve this (using regular expressions, anyway):

select
    sum(surname REGEXP '^[a-dA-D].*') as ad_count,
    sum(surname REGEXP '^[e-hE-H].*') as eh_count,
    sum(surname REGEXP '^[i-lI-L].*') as il_count,
    sum(surname REGEXP '^[m-pM-P].*') as mp_count,
    sum(surname REGEXP '^[q-tQ-T].*') as qt_count,
    sum(surname REGEXP '^[u-zU-Z].*') as uz_count
from people

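Because in MySQL true is 1 and false is 0, sum(some condition) gives the number of rows for which the condition is true, which is often an elegantly concise way to count.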
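As a minimal sketch of that idiom, the query below runs against three literal rows instead of the real table. The derived table `sample` and its contents are made up for illustration; only the `SUM(condition)` pattern comes from the answer.

```sql
-- SUM over a boolean expression adds 1 for each matching row and 0 otherwise.
SELECT
    SUM(surname REGEXP '^[a-dA-D]') AS ad_count,  -- rows starting with a-d
    COUNT(*)                        AS total      -- all rows
FROM (
    SELECT 'Adams' AS surname
    UNION ALL SELECT 'Baker'
    UNION ALL SELECT 'Smith'
) AS sample;
-- ad_count is 2 (Adams, Baker), total is 3
```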

By the way, I added uppercase letters to your regular expressions.

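You would get better performance by computing the group more efficiently in an inner select (for example, with a CASE on substr(surname,1,1)) and then summing over tests against that computed value.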
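The inner-select idea could look roughly like the sketch below. It assumes the `people` table and `surname` column from the question; the names `bucket` and `bucketed` are hypothetical, and the performance benefit has not been measured here.

```sql
-- Inner select: classify each surname once by its first letter.
-- Outer select: count rows per bucket with SUM(condition).
SELECT
    SUM(bucket = 1) AS ad_count,
    SUM(bucket = 2) AS eh_count,
    SUM(bucket = 3) AS il_count,
    SUM(bucket = 4) AS mp_count,
    SUM(bucket = 5) AS qt_count,
    SUM(bucket = 6) AS uz_count
FROM (
    SELECT CASE
               WHEN UPPER(SUBSTR(surname, 1, 1)) BETWEEN 'A' AND 'D' THEN 1
               WHEN UPPER(SUBSTR(surname, 1, 1)) BETWEEN 'E' AND 'H' THEN 2
               WHEN UPPER(SUBSTR(surname, 1, 1)) BETWEEN 'I' AND 'L' THEN 3
               WHEN UPPER(SUBSTR(surname, 1, 1)) BETWEEN 'M' AND 'P' THEN 4
               WHEN UPPER(SUBSTR(surname, 1, 1)) BETWEEN 'Q' AND 'T' THEN 5
               WHEN UPPER(SUBSTR(surname, 1, 1)) BETWEEN 'U' AND 'Z' THEN 6
               ELSE 0  -- anything that does not start with a letter A-Z
           END AS bucket
    FROM people
) AS bucketed;
```

Each row is classified by at most six cheap string comparisons, and the outer sums only compare small integers instead of running six regular expressions per row.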

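Answer 1: (score: 2)

Regular expressions are overkill here and completely unnecessary.

Perhaps something like this, using basic string comparison: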

SELECT
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'a' AND 'd' THEN 1 ELSE 0 END) AS `SUM_a-d`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'e' AND 'h' THEN 1 ELSE 0 END) AS `SUM_e-h`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'i' AND 'l' THEN 1 ELSE 0 END) AS `SUM_i-l`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'm' AND 'p' THEN 1 ELSE 0 END) AS `SUM_m-p`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'q' AND 't' THEN 1 ELSE 0 END) AS `SUM_q-t`,
   SUM(CASE WHEN SUBSTR(`surname`, 1, 1) BETWEEN 'u' AND 'z' THEN 1 ELSE 0 END) AS `SUM_u-z`
FROM `people`

Answer 2: (score: 0)

You can make the query more explicit, like this:

SELECT 
  SUM(CASE WHEN  surname REGEXP '^[a-d].*' THEN 1 ELSE 0 END) AS a_d_count
  ,SUM(CASE WHEN surname REGEXP '^[e-h].*' THEN 1 ELSE 0 END) AS e_h_count
  ,SUM(CASE WHEN surname REGEXP '^[i-l].*' THEN 1 ELSE 0 END) AS i_l_count
  ,SUM(CASE WHEN surname REGEXP '^[m-p].*' THEN 1 ELSE 0 END) AS m_p_count
  ,SUM(CASE WHEN surname REGEXP '^[q-t].*' THEN 1 ELSE 0 END) AS q_t_count
  ,SUM(CASE WHEN surname REGEXP '^[u-z].*' THEN 1 ELSE 0 END) AS u_z_count
FROM (SELECT surname FROM people ORDER BY surname ASC) p

Answer 3: (score: 0)

To avoid both regular expressions and conditionals, you can do this:

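-- Bucket index = (code of first letter - 65) DIV 4; label = first and last letter of the bucket.
SELECT CONCAT(CHAR(65 + 4 * ((ASCII(UPPER(surname)) - 65) DIV 4) USING utf8mb4), '-',
              CHAR(LEAST(68 + 4 * ((ASCII(UPPER(surname)) - 65) DIV 4), 90) USING utf8mb4)) AS r,
  count(id)
FROM people
GROUP BY r;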

This makes each range 4 letters wide, which means the last range is just 'yz', but you can adjust that with a bit more math.
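One possible adjustment, sketched here under the assumption that surnames start with an unaccented letter A-Z, is to cap the bucket index so that Y and Z fall into the same range as U-X. That gives the six ranges from the question. The alias `surname_count` is made up for illustration; `ELT`, `LEAST`, `ASCII` and `DIV` are standard MySQL.

```sql
-- Bucket index: (letter code - 65) DIV 4 gives 0..6 for A..Z;
-- LEAST(..., 5) merges bucket 6 (Y, Z) into bucket 5 (U-X).
SELECT
    ELT(LEAST((ASCII(UPPER(surname)) - 65) DIV 4, 5) + 1,
        'A-D', 'E-H', 'I-L', 'M-P', 'Q-T', 'U-Z') AS r,
    COUNT(*) AS surname_count
FROM people
WHERE ASCII(UPPER(surname)) BETWEEN 65 AND 90  -- skip rows not starting with A-Z
GROUP BY r;
```

Unlike the SUM-based answers, this returns one row per range, and ranges with no surnames are simply absent from the result.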