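I have three tables, as follows:
Area (Id, Description)
City (Id, Name)
Problem (Id, City, Area, Definition)
where Problem.City references City (Id) and Problem.Area references Area (Id).
For each city (Name), I want to find the area (Description) that occurs most often in that city's problems.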
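Example:
Area
| Id | Description |
|----|-------------|
| 1  | Support     |
| 2  | Finance     |
City
| Id | Name    |
|----|---------|
| 1  | Chicago |
| 2  | Boston  |
Problem
| Id | City | Area | Definition |
|----|------|------|------------|
| 1  | 1    | 2    | A          |
| 2  | 1    | 2    | B          |
| 3  | 1    | 1    | C          |
| 4  | 2    | 1    | D          |
Expected output:
| Name    | Description |
|---------|-------------|
| Chicago | Finance     |
| Boston  | Support     |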
Here is what I tried, without success:
SELECT Name,
       Description
FROM
  (SELECT *
   FROM Problem AS P,
        City AS C,
        Area AS A
   WHERE C.Id = P.City
     AND A.Id = P.Area ) AS T1
WHERE Description =
    (SELECT Description
     FROM
       (SELECT *
        FROM Problem AS P,
             City AS C,
             Area AS A
        WHERE C.Id = P.City
          AND A.Id = P.Area ) AS T2
     WHERE T1.Name = T2.Name
     GROUP BY Description
     ORDER BY Count(Name) DESC LIMIT 1 )
GROUP BY Name,
         Description
Thanks!
Answer 0 (score: 1)
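Count the problems per city and area, take the highest count within each city, and join back to find the area that reaches it:
select C.Name, A.Description
from (
    -- number of problems per (city, area) pair
    select P.City, P.Area, count(*) as Freq
    from Problem as P
    group by P.City, P.Area
) t1
INNER JOIN (
    -- highest per-area count within each city
    select f.City, max(f.Freq) as max_freq
    from (
        select P.City, P.Area, count(*) as Freq
        from Problem as P
        group by P.City, P.Area
    ) f
    group by f.City
) t2 ON t2.City = t1.City AND t2.max_freq = t1.Freq
INNER JOIN City AS C ON t1.City = C.Id
INNER JOIN Area AS A ON A.Id = t1.Area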
Answer 1 (score: 1)
This is probably the shortest way to solve the problem:
select c.Name, a.Description
from City c
cross join Area a
where a.Id = (
    select p.Area
    from Problem p
    where p.City = c.Id
    group by p.Area
    order by count(*) desc, p.Area asc
    limit 1
)
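We use CROSS JOIN to combine every City with every Area, but we keep only the Area that has the highest count in the Problem table for the given city, which is determined in the correlated subquery. If two areas tie for the highest count in a city, the one with the lowest Area Id is chosen (order by ... p.Area asc). Note that p.Area is the numeric Id, so the tie is not broken alphabetically by Description.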
Result:
| Name | Description |
|---------|-------------|
| Boston | Support |
| Chicago | Finance |
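To check the correlated-subquery version against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here, which is an assumption of this sketch rather than part of the original answer. The query text is unchanged apart from an added order by c.Name, which only makes the output order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Area (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE City (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Problem (
        Id INTEGER PRIMARY KEY,
        City INTEGER REFERENCES City (Id),
        Area INTEGER REFERENCES Area (Id),
        Definition TEXT
    );
    INSERT INTO Area VALUES (1, 'Support'), (2, 'Finance');
    INSERT INTO City VALUES (1, 'Chicago'), (2, 'Boston');
    INSERT INTO Problem VALUES
        (1, 1, 2, 'A'), (2, 1, 2, 'B'), (3, 1, 1, 'C'), (4, 2, 1, 'D');
""")

# The correlated subquery picks, for each city, the Area Id with the most problems.
QUERY = """
    select c.Name, a.Description
    from City c
    cross join Area a
    where a.Id = (
        select p.Area
        from Problem p
        where p.City = c.Id
        group by p.Area
        order by count(*) desc, p.Area asc
        limit 1
    )
    order by c.Name
"""

rows = conn.execute(QUERY).fetchall()
for name, description in rows:
    print(name, description)
```

On the sample data this should print one line per city, matching the result table above.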
Here is another, more complex solution that also includes the count.
select c.Name, a.Description, city_area_maxcount.mc as problem_count
from (
    -- highest per-area problem count within each city
    select City, max(c) as mc
    from (
        select p.City, p.Area, count(*) as c
        from Problem p
        group by p.City, p.Area
    ) city_area_count
    group by City
) city_area_maxcount
join (
    -- problem count per (city, area) pair
    select p.City, p.Area, count(*) as c
    from Problem p
    group by p.City, p.Area
) city_area_count
    on city_area_count.City = city_area_maxcount.City
    and city_area_count.c = city_area_maxcount.mc
join City c on c.Id = city_area_count.City
join Area a on a.Id = city_area_count.Area
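The subquery used inside city_area_maxcount appears twice here, once nested in city_area_maxcount and once as city_area_count in the join (I hope MySQL can cache the result). If you think of it as a table, this becomes the common problem of finding the row with the top value per group. If two areas tie for the highest count in a city, both are selected.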
Result:
| Name | Description | problem_count |
|---------|-------------|---------------|
| Boston | Support | 1 |
| Chicago | Finance | 2 |
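On engines that support WITH (MySQL 8.0 and later, and SQLite), the repeated subquery can be written once as a common table expression. The sketch below is not from the original answer. It uses Python's sqlite3 as a stand-in for MySQL, renames the count column to cnt to avoid clashing with the City alias c, and adds one hypothetical row (problem 5, Boston/Finance) so that Boston has a tie and both of its areas come back.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Area (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE City (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Problem (
        Id INTEGER PRIMARY KEY,
        City INTEGER REFERENCES City (Id),
        Area INTEGER REFERENCES Area (Id),
        Definition TEXT
    );
    INSERT INTO Area VALUES (1, 'Support'), (2, 'Finance');
    INSERT INTO City VALUES (1, 'Chicago'), (2, 'Boston');
    INSERT INTO Problem VALUES
        (1, 1, 2, 'A'), (2, 1, 2, 'B'), (3, 1, 1, 'C'), (4, 2, 1, 'D'),
        (5, 2, 2, 'E');  -- hypothetical extra row: creates a tie for Boston
""")

QUERY = """
    with city_area_count as (
        -- problem count per (city, area) pair, written only once
        select p.City, p.Area, count(*) as cnt
        from Problem p
        group by p.City, p.Area
    ),
    city_area_maxcount as (
        -- highest per-area count within each city
        select cac.City, max(cac.cnt) as mc
        from city_area_count cac
        group by cac.City
    )
    select c.Name, a.Description, m.mc as problem_count
    from city_area_maxcount m
    join city_area_count cac
        on cac.City = m.City
        and cac.cnt = m.mc
    join City c on c.Id = cac.City
    join Area a on a.Id = cac.Area
    order by c.Name, a.Description
"""

rows = conn.execute(QUERY).fetchall()
for name, description, problem_count in rows:
    print(name, description, problem_count)
```

With the extra row, Boston should appear twice (Finance and Support, each with a count of 1), while Chicago still shows Finance with a count of 2.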