Can I run a MySQL command to filter and delete duplicate entries?

Time: 2013-02-21 06:49:03

Tags: mysql count

My table is named linkage and holds the values below:

++++++++++++++++++++++++++
+ company_id +  industry +
++++++++++++++++++++++++++
+     1      +    a      +
+     1      +    b      +
+     2      +    a      +
+     2      +    c      +
+     3      +    a      +
+     4      +    c      +
+     5      +    a      +
++++++++++++++++++++++++++

Is there a way to group the rows by industry and sort the counts in descending order? For example:

a = count 4
c = count 2
b = count 1

Then delete the duplicate rows, keeping for each company_id only the industry with the highest count.


Edit 1

This edit is based on the OP's comment: "I wish to only have the industry with the highest count, and deleting the rest of the entry for the same company_id. say for company_id 1, we will delete the second row, for company_id 2 we will delete the fourth row."

Here is what I have:

++++++++++++++++++++++++++
+ company_id +  industry +
++++++++++++++++++++++++++
+     1      +    a      +
+     1      +    b      +
+     1      +    c      +
+     2      +    a      +
+     2      +    c      +
+     3      +    a      +
+     4      +    c      +
+     5      +    a      +
++++++++++++++++++++++++++
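As the industry column shows, a has the highest count, so for every duplicated company_id I want to keep that entry and delete all the others.

For company_id = 1, I need to delete the second and third rows. For company_id = 2, I need to delete the fifth row. Nothing happens for company_id = 3, 4 and 5, because they are not duplicated.

So the final data in my table should be: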

++++++++++++++++++++++++++
+ company_id +  industry +
++++++++++++++++++++++++++
+     1      +    a      +
+     2      +    a      +
+     3      +    a      +
+     4      +    c      +
+     5      +    a      +
++++++++++++++++++++++++++

3 answers:

Answer 0 (score: 1)

How about this?

-- Count the rows per industry, most frequent industry first.
SELECT industry, COUNT(industry) AS total
FROM linkage
GROUP BY industry
ORDER BY total DESC;

Demo at sqlfiddle
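
If you want to try the grouping query against the sample data without a MySQL server at hand, here is a minimal sketch using Python's built-in sqlite3 module. SQLite only stands in for MySQL here (that is my assumption; a plain GROUP BY like this one should behave the same), and the helper name industry_counts is made up for the example. I also added a secondary sort on industry so that tied counts come back in a stable order.

```python
import sqlite3

# Rows from the first sample table in the question.
ROWS = [(1, "a"), (1, "b"), (2, "a"), (2, "c"), (3, "a"), (4, "c"), (5, "a")]


def industry_counts(rows):
    """Return (industry, total) pairs, most frequent industry first."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE linkage (company_id INTEGER, industry TEXT)")
    conn.executemany("INSERT INTO linkage VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT industry, COUNT(industry) AS total
        FROM linkage
        GROUP BY industry
        ORDER BY total DESC, industry
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(industry_counts(ROWS))  # [('a', 4), ('c', 2), ('b', 1)]
```

The printed counts match the a = 4, c = 2, b = 1 ordering given in the question.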


Edit 1

Could you take a look at the question below?

how can I delete duplicate records from my database

I think that is what you are looking for.
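
For the delete step itself, one possible approach is sketched below, again with Python's sqlite3 module standing in for MySQL (an assumption, so the statements may need adjusting on a real MySQL server). The helper name keep_top_industry and the snapshot table ranked are hypothetical names picked for this example. The idea: first store every row together with its industry's overall count, then delete any row for which the same company has a row with a higher count. When counts tie, the alphabetically first industry is kept; that tie-break is my own choice, the question does not specify one.

```python
import sqlite3

# Rows from the second sample table in the question (Edit 1).
ROWS = [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "c"),
        (3, "a"), (4, "c"), (5, "a")]


def keep_top_industry(rows):
    """Delete all but the highest-count industry for each company_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE linkage (company_id INTEGER, industry TEXT)")
    conn.executemany("INSERT INTO linkage VALUES (?, ?)", rows)

    # Snapshot: every row annotated with its industry's overall count.
    conn.execute(
        """
        CREATE TABLE ranked AS
        SELECT l.company_id, l.industry, c.total
        FROM linkage AS l
        JOIN (SELECT industry, COUNT(*) AS total
              FROM linkage
              GROUP BY industry) AS c
          ON c.industry = l.industry
        """
    )

    # Delete a row when the same company has a "better" row: a higher count,
    # or an equal count with an alphabetically earlier industry (tie-break).
    conn.execute(
        """
        DELETE FROM linkage
        WHERE EXISTS (
            SELECT 1
            FROM ranked AS me
            JOIN ranked AS other ON other.company_id = me.company_id
            WHERE me.company_id = linkage.company_id
              AND me.industry = linkage.industry
              AND (other.total > me.total
                   OR (other.total = me.total
                       AND other.industry < me.industry))
        )
        """
    )
    conn.execute("DROP TABLE ranked")

    remaining = conn.execute(
        "SELECT company_id, industry FROM linkage ORDER BY company_id, industry"
    ).fetchall()
    conn.close()
    return remaining


if __name__ == "__main__":
    print(keep_top_industry(ROWS))
```

On the sample data this leaves (1, a), (2, a), (3, a), (4, c), (5, a), which is the final table the question asks for. The snapshot table is there because MySQL rejects a DELETE whose subquery reads directly from the table being deleted from (error 1093). Note that rows that are exact copies of each other (same company_id and same industry) are not collapsed by this sketch.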

Answer 1 (score: 1)

-- Count the rows per industry, most frequent industry first.
SELECT n.industry, COUNT(n.industry) AS cnt
FROM linkage n
GROUP BY n.industry
ORDER BY cnt DESC;

-- For each company, return the industry (or industries, on a tie)
-- whose overall count is the highest among that company's rows.
SELECT t3.company_id, t4.industry
FROM (
    -- Highest industry count found among each company's rows.
    SELECT t2.company_id, MAX(t2.cnt) AS cnt
    FROM (
        SELECT m.company_id, m.industry, t1.cnt
        FROM linkage m
        JOIN (
            SELECT n.industry, COUNT(n.industry) AS cnt
            FROM linkage n
            GROUP BY n.industry
        ) t1 ON m.industry = t1.industry
    ) t2
    GROUP BY t2.company_id
) t3
JOIN (
    -- Every row annotated with its industry's overall count.
    SELECT m.company_id, m.industry, t1.cnt
    FROM linkage m
    JOIN (
        SELECT n.industry, COUNT(n.industry) AS cnt
        FROM linkage n
        GROUP BY n.industry
    ) t1 ON m.industry = t1.industry
) t4
    ON t3.company_id = t4.company_id
   AND t3.cnt = t4.cnt
ORDER BY t3.company_id;

Demo at sqlfiddle

Answer 2 (score: 1)

-- Take the per-company top-industry result and keep exactly one row per company.
-- ROW_NUMBER() is a window function, which MySQL supports from version 8.0.
SELECT t6.company_id, t6.industry
FROM (
    SELECT t5.company_id, t5.industry,
           -- Ordering by industry makes the tie-break deterministic.
           ROW_NUMBER() OVER (PARTITION BY t5.company_id ORDER BY t5.industry) AS rn
    FROM (
        SELECT t3.company_id, t4.industry
        FROM (
            -- Highest industry count found among each company's rows.
            SELECT t2.company_id, MAX(t2.cnt) AS cnt
            FROM (
                SELECT m.company_id, m.industry, t1.cnt
                FROM linkage m
                JOIN (
                    SELECT n.industry, COUNT(n.industry) AS cnt
                    FROM linkage n
                    GROUP BY n.industry
                ) t1 ON m.industry = t1.industry
            ) t2
            GROUP BY t2.company_id
        ) t3
        JOIN (
            -- Every row annotated with its industry's overall count.
            SELECT m.company_id, m.industry, t1.cnt
            FROM linkage m
            JOIN (
                SELECT n.industry, COUNT(n.industry) AS cnt
                FROM linkage n
                GROUP BY n.industry
            ) t1 ON m.industry = t1.industry
        ) t4
            ON t3.company_id = t4.company_id
           AND t3.cnt = t4.cnt
    ) t5
) t6
WHERE t6.rn = 1;