MySQL - Find duplicate data from two columns

Time: 2017-04-21 09:26:14

Tags: mysql duplicates

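I have an arbitrarily large MySQL table that contains duplicate rows, but to determine which rows are duplicates I need to match data from two columns. A modified snippet of the table is shown below.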

mysql> select * from DATA_STATUS where METADATA_ID='6ac00785-abcd-3f4a-defg-12b8ed23abff';
+--------+------------+--------------------------------------+-------------+
| ID     | STATUS     |  METADATA_ID                         | METADATA_FK |
+--------+------------+--------------------------------------+-------------+
| 1      |          3 | 6ac00785-abcd-3f4a-defg-12b8ed23abff |       1234  |
+--------+------------+--------------------------------------+-------------+
| 2      |          3 | 6ac00785-abcd-3f4a-defg-12b8ed23abff |       1234  |
+--------+------------+--------------------------------------+-------------+
| 3      |          0 | 6ac00785-abcd-3f4a-defg-12b8ed23abff |       1234  |
+--------+------------+--------------------------------------+-------------+
| 4      |          0 | 6ac00785-abcd-3f4a-defg-12b8ed23abff |       1234  |
+--------+------------+--------------------------------------+-------------+
| 5      |          1 | 6ac00785-abcd-3f4a-defg-12b8ed23abff |       1234  |
+--------+------------+--------------------------------------+-------------+
| 6      |          2 | 6ac00785-abcd-3f4a-defg-12b8ed23abff |       1234  |
+--------+------------+--------------------------------------+-------------+

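I want to select, across the whole table, the rows where the same METADATA_ID occurs more than once and the duplicated METADATA_ID rows have STATUS = 3. I know how to query a table for duplicates in a single column, but I am struggling to work out how to combine duplicate matching with an additional condition.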

Based on the sample data, the rows that match this criterion are IDs 1 and 2, but not 3.

Edit: additional information for clarification, and the TL;DR criteria

The overall criterion for a duplicate is a count of METADATA_ID > 1 and STATUS = 3; the snippet below shows the rows that match this requirement.

+--------+------------+--------------------------------------+-------------+
| ID     | STATUS     |  METADATA_ID                         | METADATA_FK |
+--------+------------+--------------------------------------+-------------+
| 1      |          3 | 6ac00785-abcd-3f4a-defg-12b8ed23abff |       1234  |
+--------+------------+--------------------------------------+-------------+
| 2      |          3 | 6ac00785-abcd-3f4a-defg-12b8ed23abff |       1234  |
+--------+------------+--------------------------------------+-------------+

When a duplicate is found, I would like the query to return either a single row containing ID, METADATA_ID and METADATA_FK (STATUS is optional), or all instances of the duplication; either is fine. If STATUS is not 3, or the METADATA_ID exists only once in the table, the data does not count as a duplicate.

2 answers:

Answer 0 (score: 1)

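Try this (yourtable and status_id stand in for your own table and column names; in the question these are DATA_STATUS and STATUS):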

select * 
from yourtable
where 
  status_id = 3 and 
  metadata_id in (
        select metadata_id 
        from yourtable
        where status_id = 3 
        group by metadata_id 
        having count(*) > 1
  );

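To check the query above against the sample rows from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the query uses only standard SQL), and it uses the DATA_STATUS table and STATUS column from the question instead of the placeholder names.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_STATUS ("
    " ID INTEGER PRIMARY KEY, STATUS INTEGER,"
    " METADATA_ID TEXT, METADATA_FK INTEGER)"
)

# The six sample rows from the question, as (ID, STATUS) pairs.
uuid = "6ac00785-abcd-3f4a-defg-12b8ed23abff"
rows = [(1, 3), (2, 3), (3, 0), (4, 0), (5, 1), (6, 2)]
conn.executemany(
    "INSERT INTO DATA_STATUS VALUES (?, ?, ?, 1234)",
    [(row_id, status, uuid) for row_id, status in rows],
)

# Keep STATUS = 3 rows whose METADATA_ID occurs more than once among STATUS = 3 rows.
query = """
SELECT ID
FROM DATA_STATUS
WHERE STATUS = 3
  AND METADATA_ID IN (
        SELECT METADATA_ID
        FROM DATA_STATUS
        WHERE STATUS = 3
        GROUP BY METADATA_ID
        HAVING COUNT(*) > 1
  )
ORDER BY ID
"""
duplicate_ids = [row[0] for row in conn.execute(query)]
print(duplicate_ids)  # [1, 2]
```

Rows 3 and 4 share a METADATA_ID as well, but their STATUS is 0, so both the outer filter and the subquery exclude them.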

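If you do not need every row, a simpler query returns one row per duplicated metadata_id. The columns are listed explicitly because select * combined with group by is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5):

select metadata_id, min(id) as id, count(*) as copies
from yourtable
where status_id = 3
group by metadata_id
having count(*) > 1;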
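The one-row-per-duplicate variant can be sketched in the same way. This is an illustration under assumptions: sqlite3 again stands in for MySQL, the short METADATA_ID values are made up for brevity, and the grouped column and aggregates are selected explicitly so the statement also runs with ONLY_FULL_GROUP_BY enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE DATA_STATUS ("
    " ID INTEGER PRIMARY KEY, STATUS INTEGER,"
    " METADATA_ID TEXT, METADATA_FK INTEGER)"
)

# Hypothetical data: 'aaa' and 'ccc' are duplicated with STATUS = 3, 'bbb' is not.
conn.executemany(
    "INSERT INTO DATA_STATUS VALUES (?, ?, ?, ?)",
    [
        (1, 3, "aaa", 10),
        (2, 3, "aaa", 10),
        (3, 3, "bbb", 20),  # only one STATUS = 3 row for 'bbb'
        (4, 0, "bbb", 20),
        (5, 3, "ccc", 30),
        (6, 3, "ccc", 30),
        (7, 3, "ccc", 30),
    ],
)

# One summary row per duplicated METADATA_ID: lowest ID and number of copies.
query = """
SELECT METADATA_ID, MIN(ID) AS FIRST_ID, COUNT(*) AS COPIES
FROM DATA_STATUS
WHERE STATUS = 3
GROUP BY METADATA_ID
HAVING COUNT(*) > 1
ORDER BY METADATA_ID
"""
summary = conn.execute(query).fetchall()
print(summary)  # [('aaa', 1, 2), ('ccc', 5, 3)]
```

MIN(ID) is just one way to pick a representative row; MAX(ID) would work equally well if the newest copy is the one of interest.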

Answer 1 (score: 1)

Assuming you want all of the records that are duplicated on these fields:

SELECT some_table.ID, 
        some_table.STATUS, 
        some_table.METADATA_ID, 
        some_table.METADATA_FK
FROM
(
    SELECT STATUS, 
        METADATA_ID, 
        METADATA_FK
    FROM some_table
    WHERE STATUS = 3
    GROUP BY STATUS, METADATA_ID, METADATA_FK
    HAVING COUNT(*) > 1
) sub0
INNER JOIN some_table
ON sub0.STATUS = some_table.STATUS
AND sub0.METADATA_ID = some_table.METADATA_ID
AND sub0.METADATA_FK = some_table.METADATA_FK

I am assuming METADATA_FK is part of what makes a record unique.
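Whether METADATA_FK belongs in the grouping changes the result, which the following sketch illustrates. It assumes sqlite3 as a stand-in for MySQL and uses made-up rows in which one STATUS = 3 row carries a different METADATA_FK; some_table is the placeholder name from this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE some_table ("
    " ID INTEGER PRIMARY KEY, STATUS INTEGER,"
    " METADATA_ID TEXT, METADATA_FK INTEGER)"
)

# Hypothetical rows: ID 3 shares METADATA_ID and STATUS but has a different METADATA_FK.
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?, ?)",
    [(1, 3, "aaa", 1234), (2, 3, "aaa", 1234), (3, 3, "aaa", 9999), (4, 0, "aaa", 1234)],
)

# Duplicates defined by METADATA_ID alone (among STATUS = 3 rows).
by_metadata_id = """
SELECT ID FROM some_table
WHERE STATUS = 3
  AND METADATA_ID IN (
        SELECT METADATA_ID FROM some_table
        WHERE STATUS = 3
        GROUP BY METADATA_ID
        HAVING COUNT(*) > 1
  )
ORDER BY ID
"""

# Duplicates defined by STATUS, METADATA_ID and METADATA_FK together (join form).
by_all_fields = """
SELECT t.ID
FROM (
    SELECT STATUS, METADATA_ID, METADATA_FK
    FROM some_table
    WHERE STATUS = 3
    GROUP BY STATUS, METADATA_ID, METADATA_FK
    HAVING COUNT(*) > 1
) sub0
INNER JOIN some_table t
   ON sub0.STATUS = t.STATUS
  AND sub0.METADATA_ID = t.METADATA_ID
  AND sub0.METADATA_FK = t.METADATA_FK
ORDER BY t.ID
"""

ids_by_metadata_id = [r[0] for r in conn.execute(by_metadata_id)]
ids_by_all_fields = [r[0] for r in conn.execute(by_all_fields)]
print(ids_by_metadata_id)  # [1, 2, 3]
print(ids_by_all_fields)   # [1, 2]
```

If METADATA_FK is always the same for a given METADATA_ID, as in the question's sample, both definitions return the same rows.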