Count the total number of duplicate rows in a table

Date: 2016-06-30 06:18:34

Tags: sql postgresql

I have the following SQL to check for duplicate rows in a table with these columns: id, case_id, raw_name, initials, name, judge_id, magistrate_id, score

SELECT MIN(id), case_id, initials, raw_name, count(*)
FROM my_table
GROUP BY case_id, raw_name, initials, name, judge_id, magistrate_id
HAVING count(*) > 1;

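(A row is considered a duplicate if it has the same values in case_id, raw_name, initials, name, judge_id and magistrate_id.)

How can I get the total number of duplicate rows that need to be deleted (leaving 1 row in each group of duplicates)?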

3 Answers:

Answer 0: (score: 0)

Try this:

select count(distinct column_name) from

Answer 1: (score: 0)

Try it like this:

SELECT MIN(id), case_id, initials, raw_name, count(*) - 1
FROM my_table
GROUP BY case_id, raw_name, initials, name, judge_id, magistrate_id
HAVING count(case_id) > 1
   AND count(raw_name) > 1
   AND count(initials) > 1
   AND count(name) > 1
   AND count(judge_id) > 1
   AND count(magistrate_id) > 1;

Sample DEMO (this is only a sample)
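
The query in this answer returns one figure per duplicate group, not a single total. As a minimal sketch of how those per-group figures could be rolled up into one number, assuming the same my_table and key columns as in the question (the aliases cnt, dup_groups and rows_to_delete are made up for illustration):

```sql
-- Sketch only: sums (group size - 1) over every group that has duplicates.
-- cnt, dup_groups and rows_to_delete are illustrative names, not from the question.
SELECT COALESCE(SUM(cnt - 1), 0) AS rows_to_delete
FROM (
    SELECT count(*) AS cnt
    FROM my_table
    GROUP BY case_id, raw_name, initials, name, judge_id, magistrate_id
    HAVING count(*) > 1          -- keep only groups with at least one duplicate
) AS dup_groups;
```

COALESCE is there so that a table without any duplicates yields 0 instead of NULL.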

Answer 2: (score: 0)

Duplicate problems can usually be expressed in terms of EXISTS(the other):

SELECT COUNT(*)
FROM my_table mt
WHERE EXISTS ( SELECT *
        FROM my_table x
        WHERE x.case_id = mt.case_id -- exactly the same "keys"
        AND x.raw_name = mt.raw_name
        AND x.initials = mt.initials
        AND x.name = mt.name
        AND x.judge_id = mt.judge_id
        AND x.magistrate_id = mt.magistrate_id
        AND x.id < mt.id             -- but a smaller (surrogate) key
          -- If your table doesn't have a unique (surrogate) key,
          -- you can use the internal "ctid" which is guaranteed to be unique
          -- AND x.ctid < mt.ctid
        );

For your final delete query: just replace SELECT COUNT(*) with DELETE.
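
Written out, that substitution might look like the sketch below. It assumes the same my_table, columns and unique id as in the question, and wraps the statement in a transaction so the reported row count can be compared with the COUNT(*) result before anything is made permanent.

```sql
-- Sketch only: deletes every row that has an identical row with a smaller id,
-- so the row with the lowest id in each duplicate group survives.
-- Assumption: none of the compared columns is NULL. "=" never matches NULL,
-- whereas GROUP BY puts NULLs in the same group; IS NOT DISTINCT FROM would
-- be needed in place of "=" to mirror the GROUP BY behaviour for NULLs.
BEGIN;

DELETE FROM my_table mt
WHERE EXISTS (
    SELECT 1
    FROM my_table x
    WHERE x.case_id = mt.case_id
      AND x.raw_name = mt.raw_name
      AND x.initials = mt.initials
      AND x.name = mt.name
      AND x.judge_id = mt.judge_id
      AND x.magistrate_id = mt.magistrate_id
      AND x.id < mt.id
);

-- Inspect the "DELETE n" row count, then finish with COMMIT instead if it matches.
ROLLBACK;
```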