Hi, I have a table that looks like this:
-----------------------------------------------------------
| id | group_id | source_id | target_id | sortsequence |
-----------------------------------------------------------
| 2 | 1 | 2 | 4 | 1 |
-----------------------------------------------------------
| 4 | 1 | 20 | 2 | 1 |
-----------------------------------------------------------
| 5 | 1 | 2 | 14 | 1 |
-----------------------------------------------------------
| 7 | 1 | 2 | 7 | 3 |
-----------------------------------------------------------
| 20 | 2 | 20 | 4 | 3 |
-----------------------------------------------------------
| 21 | 2 | 20 | 4 | 1 |
-----------------------------------------------------------
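Scenarios
There are two scenarios that need to be handled.
1. The sortsequence column value should be unique for a given source_id and group_id. For example, all records with group_id = 1 AND source_id = 2 should have unique sortsequence values. In the sample above, the records with id = 2 and 5, which both have group_id = 1 and source_id = 2, share the same sortsequence, which is 1. These are faulty records, and I need to find them.
2. When group_id and source_id are the same, the sortsequence values should be continuous, with no gaps. For example, in the table above the records with id = 20 and 21 have the same group_id and source_id, and their sortsequence values are 3 and 1. Even though these values are unique, there is a gap in the sequence. I need to find these records as well.
My effort
I wrote this query:
SELECT source_id, `group_id`, GROUP_CONCAT(id) AS children
FROM
`table`
GROUP BY source_id,
sortsequence,
`group_id`
HAVING COUNT(*) > 1
This query only handles scenario 1. How can I handle scenario 2? Is there a way to do it in the same query, or do I have to write another query for the second scenario?
By the way, the query will be dealing with millions of records in the table, so performance must be very good.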
Answer 0 (score: 1)
Got the answer from Tere J's comment. The following query covers both of the criteria above.
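SELECT
source_id, `group_id`, GROUP_CONCAT(id) AS faultyIDS
FROM
`table`
GROUP BY
source_id, group_id
HAVING
COUNT(DISTINCT sortsequence) <> COUNT(sortsequence) -- duplicate values within the group
OR COUNT(sortsequence) <> MAX(sortsequence) -- row count differs from the highest value (gap)
OR MIN(sortsequence) <> 1 -- sequence does not start at 1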
Hopefully this helps others.
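To see what the HAVING conditions above flag, here is a minimal sketch that runs the same query against the sample rows from the question. It assumes an in-memory SQLite database via Python's `sqlite3` module instead of MySQL, and a table named `tbl` (a stand-in name, since `table` is a reserved word); the names `ROWS`, `QUERY`, and `faulty` are only illustrative.

```python
import sqlite3

# Sample rows from the question:
# (id, group_id, source_id, target_id, sortsequence)
ROWS = [
    (2, 1, 2, 4, 1),
    (4, 1, 20, 2, 1),
    (5, 1, 2, 14, 1),
    (7, 1, 2, 7, 3),
    (20, 2, 20, 4, 3),
    (21, 2, 20, 4, 1),
]

# Assumption: SQLite stands in for MySQL; `tbl` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, group_id INTEGER,"
    " source_id INTEGER, target_id INTEGER, sortsequence INTEGER)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", ROWS)

# Same GROUP BY / HAVING logic as the answer's query.
QUERY = """
SELECT source_id, group_id, GROUP_CONCAT(id) AS faultyIDS
FROM tbl
GROUP BY source_id, group_id
HAVING COUNT(DISTINCT sortsequence) <> COUNT(sortsequence)
    OR COUNT(sortsequence) <> MAX(sortsequence)
    OR MIN(sortsequence) <> 1
ORDER BY source_id, group_id
"""

faulty = conn.execute(QUERY).fetchall()
for source_id, group_id, ids in faulty:
    print(source_id, group_id, ids)
```

On the sample data this flags two groups: source_id = 2 with group_id = 1 (duplicate sortsequence) and source_id = 20 with group_id = 2 (gap). Note that the query reports every id in a faulty group, so id 7 is listed even though its own value is not the duplicate. For a table with millions of rows, a composite index covering group_id, source_id, and sortsequence would likely help the grouping, though that is not measured here.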
Answer 1 (score: 0)
Try this query; it handles both of the scenarios you mentioned in the question.
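SELECT
a.*
FROM
tbl a
INNER JOIN
(SELECT
-- rId restarts at 1 whenever group_id or source_id changes
@rn:=IF(@prevG = group_id AND @prevS = source_id, @rn + 1, 1) AS rId,
@prevG:=group_id AS group_id,
@prevS:=source_id AS source_id,
id,
sortsequence
FROM
tbl
JOIN
-- initialise the user variables once
(SELECT @rn:=0, @prevS:=0, @prevG:=0) vars
ORDER BY group_id, source_id, id) b
-- keep rows whose sortsequence differs from their position within the group;
-- positions follow id order, so this assumes sortsequence is meant to follow id order
ON a.id = b.id AND a.sortsequence <> b.rId;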
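The row-numbering idea in this answer can also be sketched with a window function rather than user variables, whose evaluation order inside a single statement MySQL does not guarantee. The sketch below is an assumption-laden illustration, not the answerer's method: it uses `ROW_NUMBER()` (available in MySQL 8.0 and later) and runs on an in-memory SQLite database through Python's `sqlite3`, which needs a SQLite build with window-function support (3.25 or newer). The names `ROWS`, `QUERY`, and `mismatched` are hypothetical.

```python
import sqlite3

# Sample rows from the question:
# (id, group_id, source_id, target_id, sortsequence)
ROWS = [
    (2, 1, 2, 4, 1),
    (4, 1, 20, 2, 1),
    (5, 1, 2, 14, 1),
    (7, 1, 2, 7, 3),
    (20, 2, 20, 4, 3),
    (21, 2, 20, 4, 1),
]

# Assumption: SQLite stands in for MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, group_id INTEGER,"
    " source_id INTEGER, target_id INTEGER, sortsequence INTEGER)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", ROWS)

# Number the rows 1, 2, 3, ... in id order within each
# (group_id, source_id) pair, then keep rows whose sortsequence
# differs from that position.
QUERY = """
SELECT id, group_id, source_id, sortsequence, rid
FROM (
    SELECT id, group_id, source_id, sortsequence,
           ROW_NUMBER() OVER (
               PARTITION BY group_id, source_id
               ORDER BY id
           ) AS rid
    FROM tbl
) AS numbered
WHERE sortsequence <> rid
ORDER BY id
"""

mismatched = conn.execute(QUERY).fetchall()
for row in mismatched:
    print(row)
```

Like the query above, this compares each row against its position in id order, so it reports individual rows (ids 5, 20, and 21 on the sample data) rather than whole groups. If sortsequence is not expected to follow id order, rows can be flagged even when the set of values in the group is a valid gap-free sequence.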