How to select the first row of every group, with a count, in MySQL

Time: 2018-03-02 01:15:21

Tags: mysql sql

I have a table like this (simplified):

+------+-------+-----+--------------+-----+
| id   | name  | age | company.name | ... |
+------+-------+-----+--------------+-----+
| 1    | Adam  | 21  | Google       | ... |
| 3    | Peter | 20  | Apple        | ... |
| 2    | Bob   | 20  | Microsoft    | ... |
| 9    | Alice | 18  | Google       | ... |
+------+-------+-----+--------------+-----+

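I need to group the data by one arbitrary column and count the rows in each group. I also need the first row of every group. The user chooses which column is used for grouping.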

If the user chooses to group by age, the result will be:

+------+------------+-------+
| id   | group_name | count | 
+------+------------+-------+
| 9    | 18         |  1    |
+------+------------+-------+
| 2    | 20         |  2    |
+------+------------+-------+
| 1    | 21         |  1    |
+------+------------+-------+

The grouping column can be a number or a string.

Currently I do it with this query:

SELECT id, group_name, users_name, count(id) as count FROM (
 SELECT persons.id as id, company.type as group_name, users.name as users_name 
 FROM persons  
 LEFT JOIN company on company.id = persons.company_id 
 LEFT JOIN position on position.id=persons.position_id 
 ...
 LEFT JOIN source on source.id=persons.source_id 
 WHERE ...  
 ORDER BY if(company.type = '' or company.type is null,1,0) ASC,
 company.type ASC, IF(persons.status = '' or persons.status is null,1,0) ASC, 
 persons.status ASC, persons.id
) t1 GROUP BY group_name

But with newer versions of MySQL this query stopped working. I think the ORDER BY in the subquery is being ignored.

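I know similar questions have been asked before, but the suggested solutions don't work with my query. I have to join many tables, add multiple conditions, use a cascading order, and then select the first row from every group. I would be very happy if the solution were also optimized for performance.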

----EDIT----

Suggested solution: SQL select only rows with max value on a column

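The suggested approach of using MAX() with GROUP BY doesn't work correctly, for two reasons:

  1. If the grouping column contains strings, the query returns the last row of each group instead of the first.
  2. If my data set has a cascading order, I can't use MAX() on several columns at once.

I created an sqlfiddle with a complete example: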

    http://sqlfiddle.com/#!9/23225d/11/0

    -- EXAMPLE 1 - Group by string 
    -- base query
    SELECT persons.*, company.* FROM persons 
    LEFT JOIN company ON persons.company_id = company.id
    ORDER BY company.name ASC, company.id ASC;
    
    --   grouping query
    SELECT MAX(persons.id) as id, company.name, count(persons.id) as count
    FROM persons
    LEFT JOIN company ON persons.company_id = company.id
    GROUP BY company.name
    ORDER BY company.name ASC, persons.id ASC;
    
    -- The results will be: 
    -- |ID | NAME     | COUNT|
    -- |1  | Google   | 2    |
    -- |3  | Microsoft| 3    |
    
    -- EXAMPLE 2 - Cascade order
    -- base query
    SELECT persons.*, company.* FROM persons 
    LEFT JOIN company ON persons.company_id = company.id
    ORDER BY company.type ASC, persons.status ASC;
    
    --  grouping query 
    SELECT MAX(persons.id) as id, company.type, count(persons.id) as count
    FROM persons
    LEFT JOIN company ON persons.company_id = company.id
    GROUP BY company.type
    ORDER BY company.type ASC, persons.status ASC;
    
    -- The results will be: 
    -- |ID | NAME| COUNT|
    -- |3  |  1  |   2  |
    -- |2  |  2  |   3  |
    

1 Answer:

Answer 0 (score: 0)

Just change MAX() to MIN() to get the first row of each group instead of the last.
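As a rough illustration, here is the grouping query from EXAMPLE 1 of the question with MAX() swapped for MIN(). This is only a sketch: it returns the lowest persons.id in each group, which matches the "first row" only when rows within a group are ordered by id. The persons.id term is dropped from the ORDER BY because it is not part of the GROUP BY, and strict SQL modes may reject it.

```sql
-- EXAMPLE 1 grouping query from the question, with MIN() instead of MAX().
-- Assumption: "first" within a group means the smallest persons.id.
SELECT MIN(persons.id) AS id, company.name, COUNT(persons.id) AS count
FROM persons
LEFT JOIN company ON persons.company_id = company.id
GROUP BY company.name
ORDER BY company.name ASC;
```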

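To get the extreme values for cascading columns, see SQL : Using GROUP BY and MAX on multiple columns. Use that in the subquery part of the query to fetch the rows that contain those extreme values, as shown in SQL select only rows with max value on a column.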

So the full query takes this form:

-- t2 holds, per group: the smallest order_column1, the smallest order_column2
-- among the rows with that order_column1, and the group's total row count.
SELECT t1.id, t1.grouped_column, t2.count
FROM yourTable AS t1
JOIN (SELECT t3.grouped_column, t3.order_column1, MIN(t4.order_column2) AS order_column2, MAX(t3.count) AS count
      -- MAX() rather than SUM(): t3.count is repeated on every joined row,
      -- so summing it would multiply the count when several rows tie.
      FROM (SELECT grouped_column, MIN(order_column1) AS order_column1, COUNT(*) AS count
            FROM yourTable
            GROUP BY grouped_column) AS t3
      JOIN yourTable AS t4
      ON t3.grouped_column = t4.grouped_column AND t3.order_column1 = t4.order_column1
      GROUP BY t3.grouped_column, t3.order_column1) AS t2
ON t1.grouped_column = t2.grouped_column AND t1.order_column1 = t2.order_column1 AND t1.order_column2 = t2.order_column2

Since you want to operate on joins, I suggest defining a view that performs the joins. You can then use that view in place of yourTable in the query above.
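A minimal sketch of such a view, built from the persons and company tables in the question. The view name persons_joined and the column aliases are hypothetical, and only the company join is shown. The second statement is an alternative and not part of the approach above: assuming MySQL 8.0 or later, window functions can pick the first row of each group and its count without relying on the order of a subquery.

```sql
-- Hypothetical view name and aliases; the question's other joins and
-- WHERE conditions are left out of this sketch.
CREATE VIEW persons_joined AS
SELECT
    persons.id     AS id,
    persons.status AS status,
    company.type   AS company_type,
    company.name   AS company_name
FROM persons
LEFT JOIN company ON company.id = persons.company_id;

-- Alternative, assuming MySQL 8.0+ (window functions):
-- first row per group plus the number of rows in that group.
SELECT id, group_name, `count`
FROM (
    SELECT
        id,
        company_type AS group_name,
        -- size of the group this row belongs to
        COUNT(*) OVER (PARTITION BY company_type) AS `count`,
        -- position of the row inside its group, using the cascading
        -- order from the question (empty or NULL status sorts last)
        ROW_NUMBER() OVER (
            PARTITION BY company_type
            ORDER BY IF(status = '' OR status IS NULL, 1, 0), status, id
        ) AS rn
    FROM persons_joined
) AS ranked
WHERE rn = 1
ORDER BY IF(group_name = '' OR group_name IS NULL, 1, 0), group_name;
```

To group by a different column, the PARTITION BY expression and the group_name alias would both change to that column. Whether this performs better than the nested GROUP BY form depends on the data and indexes, so it is worth checking with EXPLAIN.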