I have a table without a primary key, and I can't add one. The relevant columns are:
Department | Category
-----------+---------
0001       | A
0002       | D
0003       | A
0003       | A
0003       | C
0004       | B
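I want to retrieve one row per Department, giving me the department code and the Category that occurs most often in the table for that department, i.e.: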
Department | Category
-----------+---------
0001       | A
0002       | D
0003       | A
0004       | B
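What is the best way to achieve this? My current attempt does a Count(Category) in a subquery and then takes Max(CountofCategory) from that, but including the Category field at that stage returns too many rows (because the GROUP BY then applies at the Category level as well as the Department level). In the case of a tie, I would just pick the min/max of the category arbitrarily. Ideally this should be database-agnostic, but it will most likely run on Oracle or MySQL.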
Answer 0: (score: 3)
This works in both Oracle and SQL Server, and I believe it is all standard SQL, from the later standards:
with T_with_RN as
  (select Department
        , Category
        , row_number() over (partition by Department order by count(*) Desc) as RN
   from T
   group by Department, Category)
select Department, Category
from T_with_RN
where RN = 1
Edit: I'm not sure why I used WITH; the solution is probably easier to read with an inline view:
select Department, Category
from (select Department
           , Category
           , row_number() over (partition by Department order by count(*) Desc) as RN
      from T
      group by Department, Category) T_with_RN
where RN = 1
End edit
Test cases:
create table T (
Department varchar(10) null,
Category varchar(10) null
);
-- Original test case
insert into T values ('0001', 'A');
insert into T values ('0002', 'D');
insert into T values ('0003', 'A');
insert into T values ('0003', 'A');
insert into T values ('0003', 'C');
insert into T values ('0004', 'B');
-- Null Test cases:
insert into T values (null, 'A');
insert into T values (null, 'B');
insert into T values (null, 'B');
insert into T values ('0005', null);
insert into T values ('0005', null);
insert into T values ('0005', 'X');
-- Tie Test case
insert into T values ('0006', 'O');
insert into T values ('0006', 'P');
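As a rough check of the inline-view query against the test rows above, here is a minimal sketch that runs it from Python's standard `sqlite3` module. Two things are assumptions rather than part of the answer: SQLite 3.25 or later (for window functions) stands in for Oracle or SQL Server, and `Category` has been added as a second sort key so that ties resolve the same way on every run. The variable names `conn`, `query` and `mode_rows` are illustrative only.

```python
import sqlite3

# Assumption: SQLite 3.25+ stands in for Oracle / SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table T (Department varchar(10) null, Category varchar(10) null)"
)
conn.executemany(
    "insert into T values (?, ?)",
    [
        # Original test case
        ("0001", "A"), ("0002", "D"), ("0003", "A"),
        ("0003", "A"), ("0003", "C"), ("0004", "B"),
        # Null test cases
        (None, "A"), (None, "B"), (None, "B"),
        ("0005", None), ("0005", None), ("0005", "X"),
        # Tie test case
        ("0006", "O"), ("0006", "P"),
    ],
)

# Same inline-view query, plus "Category" as a tie-breaker (an addition,
# not in the original answer) so equal counts pick the smallest category.
query = """
select Department, Category
from (select Department
           , Category
           , row_number() over (partition by Department
                                order by count(*) desc, Category) as RN
      from T
      group by Department, Category) T_with_RN
where RN = 1
"""

mode_rows = conn.execute(query).fetchall()
for department, category in mode_rows:
    print(department, category)
```

With the extra sort key, the tied department 0006 resolves to 'O'. Department 0005 resolves to a NULL category, because count(*) counts rows whose Category is NULL, and the rows with a NULL Department form a group of their own. Without the tie-breaker, which of 'O' or 'P' comes back is up to the database.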
Answer 1: (score: 1)
You can also try the following. The window here orders each department's categories by descending frequency, and FIRST_VALUE() picks the first one.
SELECT DISTINCT (department),
FIRST_VALUE(category) OVER
(PARTITION BY department ORDER BY count(*) DESC ROWS UNBOUNDED PRECEDING)
FROM T
GROUP BY department, category;
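To see why the DISTINCT matters in this query, the sketch below runs it with and without DISTINCT on the six rows from the question. It assumes SQLite 3.25 or later via Python's `sqlite3` module as a stand-in database; the `category` tie-breaker in the window ordering and the final ORDER BY are additions for repeatable output, and the helper name `run` is hypothetical.

```python
import sqlite3

# Assumption: SQLite 3.25+ as a stand-in; only the question's six rows are loaded.
conn = sqlite3.connect(":memory:")
conn.execute("create table T (department varchar(10), category varchar(10))")
conn.executemany(
    "insert into T values (?, ?)",
    [("0001", "A"), ("0002", "D"), ("0003", "A"),
     ("0003", "A"), ("0003", "C"), ("0004", "B")],
)

# The window function is evaluated after GROUP BY, so there is one output
# row per (department, category) group unless DISTINCT collapses them.
template = """
SELECT {distinct} department,
       FIRST_VALUE(category) OVER
         (PARTITION BY department
          ORDER BY COUNT(*) DESC, category
          ROWS UNBOUNDED PRECEDING) AS category
FROM T
GROUP BY department, category
ORDER BY department
"""

def run(distinct: bool):
    """Run the query with or without DISTINCT and return all rows."""
    sql = template.format(distinct="DISTINCT" if distinct else "")
    return conn.execute(sql).fetchall()

without_distinct = run(False)  # department 0003 appears twice
with_distinct = run(True)      # one row per department

print(len(without_distinct), len(with_distinct))
print(with_distinct)
```

Department 0003 has two groups (A and C), and both carry the same FIRST_VALUE, so the undeduplicated result holds a repeated row for it. The parentheses in `DISTINCT (department)` do not restrict the deduplication to that column; DISTINCT applies to the whole select list.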
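Answer 2: (score: 0)
If you are better with subqueries than I am you will probably want to clean this up, but in my tests it produced the result you want: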
SELECT
    main.Department AS Department,
    (SELECT
         Category
     FROM yourtable
     WHERE Department = main.Department
     GROUP BY Category
     ORDER BY COUNT(Category) DESC
     LIMIT 1) AS Category
FROM yourtable main
GROUP BY main.Department
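The trick is simply to have the subquery return a single row: ORDER BY puts the category with the highest count first, and LIMIT 1 keeps only that row.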
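A minimal sketch of the ORDER BY plus LIMIT 1 approach, run through Python's `sqlite3` module on the six rows from the question. SQLite is an assumption here, chosen because it accepts LIMIT the same way MySQL does; the `Category` tie-breaker and the outer ORDER BY are additions for repeatable output, and `limit_rows` is an illustrative name.

```python
import sqlite3

# Assumption: SQLite as a stand-in for MySQL; "yourtable" is the answer's placeholder name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table yourtable (Department varchar(10), Category varchar(10))"
)
conn.executemany(
    "insert into yourtable values (?, ?)",
    [("0001", "A"), ("0002", "D"), ("0003", "A"),
     ("0003", "A"), ("0003", "C"), ("0004", "B")],
)

# For each department in the outer query, the correlated subquery counts
# that department's categories and keeps only the most frequent one.
limit_rows = conn.execute("""
    SELECT main.Department AS Department,
           (SELECT Category
            FROM yourtable
            WHERE Department = main.Department
            GROUP BY Category
            ORDER BY COUNT(Category) DESC, Category
            LIMIT 1) AS Category
    FROM yourtable main
    GROUP BY main.Department
    ORDER BY main.Department
""").fetchall()

for department, category in limit_rows:
    print(department, category)
```

Two caveats apply. LIMIT is not available in Oracle, so this form suits the MySQL side of the question. Also, COUNT(Category) skips NULL categories and the `=` comparison never matches a NULL Department, so rows with NULLs behave differently here than in the window-function answers.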
Answer 3: (score: 0)
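There is an easier way (STATS_MODE is an Oracle aggregate function, so it needs a GROUP BY to return one row per department):
select department, stats_mode(category) from T group by department;
This works well when you only need the most frequent value; when you need the second, third, ... most frequent values, you have to do the counting as in the answers above.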