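Suppose I have a table with a few fields, for example A, B, C, D.
I need to group by field A and, for each of B, C and D, select the value that occurs most often within the group.
Example: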
+---+---+---+----+
| A | B | C | D |
+---+---+---+----+
| 1 | 3 | 5 | 15 |
+---+---+---+----+
| 1 | 5 | 6 | 32 |
+---+---+---+----+
| 1 | 5 | 6 | 34 |
+---+---+---+----+
| 2 | 7 | 5 | 50 |
+---+---+---+----+
| 2 | 8 | 1 | 32 |
+---+---+---+----+
Expected result:
+---+---+---+----+
| A | B | C | D |
+---+---+---+----+
| 1 | 5 | 6 | 15 |
+---+---+---+----+
| 2 | 7 | 5 | 50 |
+---+---+---+----+
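I have seen many examples of how to select the most frequent value from a single column by using COUNT(*) and then applying MAX to it. But how should it be done in this case?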
Answer 0: (score: 0)
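The query looks a bit complicated because you have to do this for 3 columns. The idea is to group by each of the combinations a-b, a-c and a-d, count the rows in each group, rank the groups within each a by that count, and keep the first-ranked row for each combination. The ranking is done using variables. When counts are tied, the lowest value of b, c or d is returned. (This can be changed by reversing the sort order.) A final aggregation is then needed to bring the respective values together into one row.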
-- "rank" is a reserved word in MySQL 8.0+, so the ranking alias is named rnk.
select a, max(b) as b, max(c) as c, max(d) as d
from (
    -- most frequent b per a (ties: lowest b)
    select a, b, c, d
    from (select a, b, c, d,
                 @rn := case when @prev = a then @rn + 1 else 1 end as rnk,
                 @prev := a as prev_a
          from (select a, b, null as c, null as d, count(*) as cnt
                from tbl
                group by a, b
               ) t
          cross join (select @rn := 0, @prev := '') r
          order by a, cnt desc, b
         ) t
    where rnk = 1
    union all
    -- most frequent c per a (ties: lowest c)
    select a, b, c, d
    from (select a, b, c, d,
                 @rn := case when @prev = a then @rn + 1 else 1 end as rnk,
                 @prev := a as prev_a
          from (select a, null as b, c, null as d, count(*) as cnt
                from tbl
                group by a, c
               ) t
          cross join (select @rn := 0, @prev := '') r
          order by a, cnt desc, c
         ) t
    where rnk = 1
    union all
    -- most frequent d per a (ties: lowest d)
    select a, b, c, d
    from (select a, b, c, d,
                 @rn := case when @prev = a then @rn + 1 else 1 end as rnk,
                 @prev := a as prev_a
          from (select a, null as b, null as c, d, count(*) as cnt
                from tbl
                group by a, d
               ) t
          cross join (select @rn := 0, @prev := '') r
          order by a, cnt desc, d
         ) t
    where rnk = 1
) t
-- each branch fills only one of b, c, d; max() collapses them into one row per a
group by a
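
On engines that support window functions (MySQL 8.0+, for instance), the same count-then-rank idea can probably be written without user variables by using ROW_NUMBER(). The following is only a minimal sketch under stated assumptions, not the query above: it runs the SQL against an in-memory SQLite database from Python purely so it can be executed standalone (this assumes SQLite 3.25 or later for window functions), and `mode_subquery` and `most_frequent_per_group` are hypothetical helper names introduced for illustration. Ties are still broken by the lowest value.

```python
import sqlite3

# Sample rows from the question: (A, B, C, D).
ROWS = [
    (1, 3, 5, 15),
    (1, 5, 6, 32),
    (1, 5, 6, 34),
    (2, 7, 5, 50),
    (2, 8, 1, 32),
]


def mode_subquery(col):
    """Hypothetical helper: SQL that returns, for each a, the most frequent
    value of `col`, breaking ties by the lowest value."""
    return f"""
        select a, {col}
        from (select a, {col},
                     row_number() over (partition by a
                                        order by cnt desc, {col}) as rn
              from (select a, {col}, count(*) as cnt
                    from tbl
                    group by a, {col}) counted
             ) ranked
        where rn = 1
    """


# One ranked subquery per column, joined back together on a.
QUERY = f"""
    select mb.a, mb.b, mc.c, md.d
    from ({mode_subquery('b')}) mb
    join ({mode_subquery('c')}) mc on mc.a = mb.a
    join ({mode_subquery('d')}) md on md.a = mb.a
    order by mb.a
"""


def most_frequent_per_group(rows):
    """Load rows into an in-memory table named tbl and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table tbl (a integer, b integer, c integer, d integer)"
        )
        conn.executemany("insert into tbl values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = most_frequent_per_group(ROWS)
print(result)
```

With the sample data, group 1 comes out as (1, 5, 6, 15). In group 2 every value of B, C and D occurs exactly once, so the lowest-value tie-break yields (2, 7, 1, 32) rather than the (2, 7, 5, 50) shown in the question. Reproducing that row would need a different tie-breaking rule, such as order of first appearance, which requires a column that records row order.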