SQL GROUP BY: how to get the most frequently occurring value

Time: 2015-05-13 07:29:51

Tags: sql

I have a table:

Id1   Id2
1     2
1     2
1     3
1     4
1     4
1     4
2     2
2     3
2     3

I need the following result (for each Id1, the Id2 with the highest count):

1  4
2  3

Edit:

Id1   Id2
1     10
1     10
1     5
1     2
1     2
1     2
2     20
2     6
2     6

I need the following result (for each Id1, the Id2 with the highest count):

1  2
2  6

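Edit (from Stefan Steinegger):

The query should return each id1 together with the id2 that occurs most often in combination with it. (Hope this helps to understand the question.)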
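
As a reference for what the expected output means, here is a minimal Python sketch (not part of the original question) that computes, for each id1, the id2 that occurs with it most often, using the edited sample data. The function name `most_frequent_id2` is made up for illustration, and breaking ties by first appearance is an assumption; the question does not say what should happen on a tie.

```python
from collections import Counter, defaultdict

def most_frequent_id2(rows):
    """Map each id1 to the id2 that appears with it most often."""
    counts = defaultdict(Counter)
    for id1, id2 in rows:
        counts[id1][id2] += 1
    # most_common(1) returns [(id2, count)]; on a tie the id2 seen first wins
    return {id1: c.most_common(1)[0][0] for id1, c in counts.items()}

# Edited sample data from the question
rows = [(1, 10), (1, 10), (1, 5), (1, 2), (1, 2), (1, 2),
        (2, 20), (2, 6), (2, 6)]
result = most_frequent_id2(rows)
print(result)
```

For the edited data this yields 2 for id1 = 1 and 6 for id1 = 2, matching the result asked for above. Any SQL answer can be checked against the same pairs.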

6 answers:

Answer 0: (score: 4)

Try this SQL demo:

-- test_2: the highest count per Id1
-- test_3: the count of every (Id1, Id2) pair
select test_2.Id1, test_3.Id2
from (select Id1, MAX(countval) as val
      from (select Id1, Id2, count(Id2) as countval
            from test
            group by Id2, Id1) as test_1
      group by Id1) as test_2
inner join
     (select Id1, Id2, count(Id2) as countval
      from test
      group by Id2, Id1) as test_3
  on test_3.countval = test_2.val and test_2.Id1 = test_3.Id1

Answer 1: (score: 3)

Try this:

SELECT id1,
       id2
FROM   (SELECT *,
               row_number()
                 OVER (
                   partition BY id1
                   ORDER BY cnt DESC) [rn]
        FROM   (SELECT *,
                       COUNT(*)
                         OVER (
                           partition BY id1, id2) AS [cnt]
                FROM   @table) t) t1
WHERE  rn = 1 

Answer 2: (score: 0)

Use this code:

create table id
(
id1 int,
id2 int)

insert into id values(1,2)
insert into id values(1,2)
insert into id values(1,3)
insert into id values(1,4)
insert into id values(1,4)
insert into id values(1,4)
insert into id values(2,2)
insert into id values(2,3)
insert into id values(2,3)


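-- Note: max(id2) returns the largest id2 per id1, not the most frequent one;
-- the two only happen to coincide for this sample data.
select id1,max(id2) as 'maxcount'
from id
group by id1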

drop table id

Output:

id1 maxcount
1   4
2   3

Answer 3: (score: 0)

You can test this on sqlfiddle.com.

Build the schema:

create table test (id1 number, id2 number);
insert into test values (1, 2);
insert into test values (1, 3);
insert into test values (1, 3);
insert into test values (1, 4);
insert into test values (1, 4);
insert into test values (1, 4);
insert into test values (2, 2);
insert into test values (2, 3);
insert into test values (2, 3);
select test.*, rowid from test;

Run your SQL:

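-- Note: max(id2) returns the largest id2 per id1, not the most frequent one;
-- the two only happen to coincide for this sample data.
SELECT id1, max(id2) max
FROM test
GROUP BY id1;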

This is for your first, original post. You can do the same for the edited part.

Result:

ID1 MAX
1   4
2   3

Answer 4: (score: 0)

You can use the HAVING clause with a subquery that counts the occurrences of each id2.

Example:

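SELECT t.id1, t.id2
FROM test AS t
GROUP BY t.id1, t.id2
-- keep only the (id1, id2) pairs whose count equals the highest count within their id1
HAVING count(t.id2) =
  (SELECT max(sub.groupCount)
   FROM
     (SELECT id1, id2, count(id2) as groupCount
      FROM test
      GROUP BY id1, id2) as sub
   WHERE sub.id1 = t.id1)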

Answer 5: (score: 0)

The following answers are based on standard SQL; most DBMSs should support them.

Using windowed aggregate functions:

SELECT *
FROM 
 (
   SELECT id1, id2, COUNT(*) AS CNT,
      RANK() -- maximum count per id1 gets rank 1
      OVER (PARTITION BY id1
            ORDER BY COUNT(*) DESC) AS rnk
   FROM test
   GROUP BY id1, id2
 ) AS dt
WHERE rnk = 1

What do you want to return when multiple values share the same maximum count? RANK returns all of them; change it to ROW_NUMBER to get only one.
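
To make the tie behaviour concrete, here is a small sketch that runs both variants against an in-memory SQLite database from Python. It assumes a SQLite build with window function support (3.25 or later). The tie data and the helper name `top_per_group` are invented for illustration, and the extra `id2` sort key in the ROW_NUMBER variant is an added assumption that makes the single pick deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id1 INTEGER, id2 INTEGER)")
# Made-up data: for id1 = 1, id2 = 2 and id2 = 3 tie with two rows each
conn.executemany("INSERT INTO test VALUES (?, ?)",
                 [(1, 2), (1, 2), (1, 3), (1, 3), (2, 6)])

def top_per_group(func, order_by):
    """Return the (id1, id2) rows ranked first by the given window function."""
    sql = f"""
        SELECT id1, id2
        FROM (
            SELECT id1, id2,
                   {func}() OVER (PARTITION BY id1 ORDER BY {order_by}) AS rnk
            FROM test
            GROUP BY id1, id2
        ) AS dt
        WHERE rnk = 1
        ORDER BY id1, id2
    """
    return conn.execute(sql).fetchall()

with_ties = top_per_group("RANK", "COUNT(*) DESC")             # keeps every tied id2
one_each = top_per_group("ROW_NUMBER", "COUNT(*) DESC, id2")   # exactly one row per id1
print(with_ties)
print(one_each)
```

With this data, the RANK variant returns both tied rows for id1 = 1, while the ROW_NUMBER variant returns a single row per id1.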

The next version uses old-style SQL that every DBMS should support:

SELECT id1, id2, COUNT(*) AS CNT
FROM test AS t1
GROUP BY id1, id2
HAVING COUNT(*) = 
 ( SELECT MAX(CNT) 
   FROM
    (
      SELECT id1, id2, COUNT(*) AS CNT
      FROM test
      GROUP BY id1, id2
    )  AS t2
   WHERE t1.id1 = t2.id1
 )

This returns all values with the maximum count; getting only one of them is more complicated.
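
One possible way to get a single row per id1 without window functions is to aggregate once more over the tied winners. The sketch below is not from the original answer: it filters a derived table instead of using HAVING, picks the smallest id2 among ties (an arbitrary tie-break chosen for illustration), and runs against an in-memory SQLite database with made-up tie data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id1 INTEGER, id2 INTEGER)")
# Made-up data: for id1 = 1, id2 = 2 and id2 = 3 tie with two rows each
conn.executemany("INSERT INTO test VALUES (?, ?)",
                 [(1, 2), (1, 2), (1, 3), (1, 3), (2, 6)])

SQL = """
SELECT c.id1, MIN(c.id2) AS id2      -- tie-break: smallest id2 wins (assumption)
FROM (SELECT id1, id2, COUNT(*) AS CNT
      FROM test
      GROUP BY id1, id2) AS c
WHERE c.CNT = (SELECT MAX(c2.CNT)    -- highest count within the same id1
               FROM (SELECT id1, id2, COUNT(*) AS CNT
                     FROM test
                     GROUP BY id1, id2) AS c2
               WHERE c2.id1 = c.id1)
GROUP BY c.id1
ORDER BY c.id1
"""
one_per_group = conn.execute(SQL).fetchall()
print(one_per_group)
```

The outer GROUP BY collapses the tied rows, so each id1 appears exactly once; swapping MIN for MAX would pick the largest tied id2 instead.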

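If you're lucky, your DBMS supports WITH, so you only have to write the aggregation once (though this may just look nicer; the optimizer might actually run it much like the previous query):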

WITH cte AS
 (
   SELECT id1, id2, COUNT(*) AS CNT
   FROM test
   GROUP BY id1, id2
 ) 
SELECT *
FROM cte AS t1
WHERE CNT = 
 ( SELECT MAX(CNT) 
   FROM cte AS t2
   WHERE t1.id1 = t2.id1
 )