I have a table T1 with columns id, C1, C2 and C3. I am using the following query to find duplicate records:
Select group_concat(id) from T1 group by C2 having count(id) >1;
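Now I want to group all the duplicate records by column C3. How can I do that? Note: I am not looking for
Select group_concat(id) from T1 group by C2, C3 having count(id) > 1;
I want to get all records that have a duplicate value in C2 and group them by C3 only, regardless of their C2 values.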
| id | C1 | C2 | C3 |
| --- | --- | --- | --- |
| 1 | a | 3 | A |
| 2 | b | 2 | A |
| 3 | c | 2 | A |
| 4 | d | 2 | B |
| 5 | e | 3 | C |
In the data above, 1 and 5 are duplicates with C2 value 3, and 2, 3 and 4 are duplicates with C2 value 2. I want output like this:
A - has 2 duplicates (with C2 values 2 and 3 )
B - has 1 duplicate (with C2 value 2)
C - has 1 duplicate (with C2 value 3)
Answer 0 (score: 1)
SELECT GROUP_CONCAT(id)
FROM T1
WHERE C2 IN
(
SELECT C2
FROM T1
GROUP BY C2
HAVING COUNT(id)>1
)
GROUP BY C3
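To see what this query returns on the sample data from the question, here is a minimal sketch that runs it through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (both provide GROUP_CONCAT), and the helper name ids_by_c3, the added C3 column in the SELECT list, and the ORDER BY are my own additions for readability, not part of the answer.

```python
import sqlite3

# Sample rows from the question: (id, C1, C2, C3).
ROWS = [
    (1, "a", 3, "A"),
    (2, "b", 2, "A"),
    (3, "c", 2, "A"),
    (4, "d", 2, "B"),
    (5, "e", 3, "C"),
]


def ids_by_c3(rows):
    """Return {C3: sorted ids} for rows whose C2 value occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE T1 (id INTEGER PRIMARY KEY, C1 TEXT, C2 INTEGER, C3 TEXT)"
    )
    conn.executemany("INSERT INTO T1 VALUES (?, ?, ?, ?)", rows)
    # Same query as above; C3 is selected as well (an addition) so each
    # id list can be matched to its group.
    query = """
        SELECT C3, GROUP_CONCAT(id)
        FROM T1
        WHERE C2 IN (
            SELECT C2 FROM T1 GROUP BY C2 HAVING COUNT(id) > 1
        )
        GROUP BY C3
        ORDER BY C3
    """
    result = {}
    for c3, ids in conn.execute(query):
        # The order of ids inside GROUP_CONCAT is not guaranteed, so sort them.
        result[c3] = sorted(int(i) for i in ids.split(","))
    conn.close()
    return result


print(ids_by_c3(ROWS))  # {'A': [1, 2, 3], 'B': [4], 'C': [5]}
```

Note that this gives the ids of the duplicate rows per C3 value, not the number of distinct duplicated C2 values per C3 that the question's expected output lists.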
Answer 1 (score: 1)
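In a derived table, GROUP BY C2 and determine its count. C2 values with a count greater than 1 are the duplicates (they appear in more than one row).
Join this result set to the main table on C2. This gives every row an extra column showing the count for its C2 value.
Then use conditional aggregation with COUNT(DISTINCT ...), grouped by C3. Try: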
SELECT
t.C3,
COUNT(DISTINCT IF(dt.count_C2 > 1, t.C2, NULL)) AS duplicates
FROM
your_table AS t
JOIN
(
SELECT
C2,
COUNT(id) AS count_C2
FROM your_table
GROUP BY C2
) AS dt
ON dt.C2 = t.C2
GROUP BY t.C3
Result
| C3 | duplicates |
| --- | ---------- |
| A | 2 |
| B | 1 |
| C | 1 |
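
The result can be checked against the question's sample data with a small sketch using Python's sqlite3 module. This assumes SQLite as a stand-in for MySQL; because IF() is MySQL-specific, the sketch swaps in the equivalent CASE WHEN ... END expression. The helper name duplicate_counts and the ORDER BY are illustrative additions.

```python
import sqlite3

# Sample rows from the question: (id, C1, C2, C3).
ROWS = [
    (1, "a", 3, "A"),
    (2, "b", 2, "A"),
    (3, "c", 2, "A"),
    (4, "d", 2, "B"),
    (5, "e", 3, "C"),
]


def duplicate_counts(rows):
    """Return [(C3, number of distinct duplicated C2 values)] ordered by C3."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table "
        "(id INTEGER PRIMARY KEY, C1 TEXT, C2 INTEGER, C3 TEXT)"
    )
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)
    # CASE WHEN replaces MySQL's IF(); it yields NULL when the C2 value is
    # not duplicated, and COUNT(DISTINCT ...) ignores NULLs.
    query = """
        SELECT
            t.C3,
            COUNT(DISTINCT CASE WHEN dt.count_C2 > 1 THEN t.C2 END) AS duplicates
        FROM your_table AS t
        JOIN (
            SELECT C2, COUNT(id) AS count_C2
            FROM your_table
            GROUP BY C2
        ) AS dt ON dt.C2 = t.C2
        GROUP BY t.C3
        ORDER BY t.C3
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(duplicate_counts(ROWS))  # [('A', 2), ('B', 1), ('C', 1)]
```

A C3 group whose rows all have unique C2 values still appears in the output, with a count of 0, because the filtering happens inside the aggregate rather than in a WHERE clause.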